Logica is a logic language for data manipulation developed by Google. It can be compiled into several SQL dialects, including BigQuery, SQLite and PostgreSQL. SQL, as a programming language, has several design flaws and suffers from bad readability especially when it comes to complex data queries. Logica has been aimed at mitigating those problems with SQL, and this post attempts to show that it does a quite good job in enhancing code readability and reusability, taking the SQL Murder Mystery as an example.
For each step, a series of Logica predicates I write, the SQL query the last Logica predicate is compiled to, and the tabular query results will be given. The first query simply retrieves relevant crime scene reports.
SELECT crime_scene_report.* FROM `crime_scene_report`AS crime_scene_report WHERE crime_scene_report.type = "murder"AND crime_scene_report.city = "SQL City";
date
type
description
city
20180215
murder
REDACTED REDACTED REDACTED
SQL City
20180215
murder
Someone killed the guard! He took an arrow to the knee!
SQL City
20180115
murder
Security footage shows that there were 2 witnesses. The first witness lives at the last house on “Northwestern Dr”. The second witness, named Annabel, lives somewhere on “Franklin Ave”.
SQL City
Only the last row seems to offer us useful information. We first need to ascertain the two witnesses mentioned. A combine clause is used to compute the maximum address number on “Northwestern Dr”. Notice that giving a field an alias of a constant expression can concisely restrict the value of the field. Also notice that A LIKE B expression in SQL is written as Like(A, B) in Logica. Predicate A includes the information of the witnesses, which is thereafter joined with the interview table to retrieve the transcripts. When joining tables, joining keys does not have to be specified explicitly as in SQL, only required to have the same alias.
The complied SQL query is obviously more verbose, though it is very much the way how a human would write it. The subquery in place of the combine clause is harder to read, and when using UNION ALL we need another FROM clause which is repetitive. The WITH structure to a certain extent improves readability, but not as much as the use of separate, sequential predicates in Logica.
SELECT person.* FROM `person`AS person WHERE person.address_number = (SELECT MAX(t_2_person.address_number) AS logica_value FROM `person`AS t_2_person WHERE "Northwestern Dr" = t_2_person.address_street_name) AND person.address_street_name = "Northwestern Dr" UNIONALL
SELECT t_4_person.* FROM `person`AS t_4_person WHERE (t_4_person.name LIKE"%Annabel%") AND t_4_person.address_street_name = "Franklin Ave"
) AS UNUSED_TABLE_NAME ) SELECT A.id ASid, A.name ASname, interview.transcript AS transcript FROM t_0_A AS A, `interview`AS interview WHERE interview.person_id = A.id;
id
name
transcript
14887
Morty Schapiro
I heard a gunshot and then saw a man run out. He had a “Get Fit Now Gym” bag. The membership number on the bag started with “48Z”. Only gold members have those bags. The man got into a car with a plate that included “H42W”.
16371
Annabel Miller
I saw the murder happen, and I recognized the killer from my gym when I was working out last week on January the 9th.
It turns out that we only need to use the first interview to find out the murderer. Here predicate A pins down the clues regarding the gym and B the car plate, both of which are joined to give the answer. Notice that the complied SQL query joins three tables directly, which can be circumvented using a @With predicate as shown in later steps.
SELECT get_fit_now_member.person_id ASid, get_fit_now_member.name ASname FROM `get_fit_now_member`AS get_fit_now_member, `drivers_license`AS drivers_license, `person`AS person WHERE get_fit_now_member.membership_status = "gold"AND (get_fit_now_member.id LIKE"48Z%") AND (drivers_license.plate_number LIKE"%H42W%") AND person.id = get_fit_now_member.person_id AND person.license_id = drivers_license.id;
id
name
67318
Jeremy Bowers
value
Congrats, you found the murderer! But wait, there’s more… If you think you’re up for a challenge, try querying the interview transcript of the murderer to find the real villain behind this crime. If you feel especially confident in your SQL skills, try to complete this final step with no more than 2 queries. Use this same INSERT statement with your new suspect to check your answer.
A simple query to retrieve the interview of the murderer.
1 2
A(..r) :- `interview`(..r), r.person_id == 67318;
1 2 3 4 5 6
SELECT interview.* FROM `interview`AS interview WHERE interview.person_id = 67318;
person_id
transcript
67318
I was hired by a woman with a lot of money. I don’t know her name but I know she’s around 5’5″ (65″) or 5’7″ (67″). She has red hair and she drives a Tesla Model S. I know that she attended the SQL Symphony Concert 3 times in December 2017.
Now predicate A is regarding the event and B driving license, and C joins them together with person. In A, an aggregation is used to count the number of times a person is checked in at the given event name during Dec 2017. (The dates are given in integers in the database.) I leave the “3 times” constraint to the joining predicate in a concise way. Notice that we can explicitly demand that Logica compile the subqueries individually without mixing them up in a large join query using a @With predicate for better efficiency in execution (though it doesn’t make much difference here).
WITH t_0_A AS (SELECT facebook_event_checkin.person_id AS person_id, SUM(1) AScount FROM `facebook_event_checkin`AS facebook_event_checkin WHERE (facebook_event_checkin.date >= 20171201) AND (facebook_event_checkin.date <= 20171231) AND facebook_event_checkin.event_name = "SQL Symphony Concert" GROUPBY person_id), t_1_B AS (SELECT drivers_license.* FROM `drivers_license`AS drivers_license WHERE (drivers_license.height >= 65) AND (drivers_license.height <= 67) AND drivers_license.hair_color = "red"AND drivers_license.gender = "female"AND drivers_license.car_make = "Tesla"AND drivers_license.car_model = "Model S") SELECT A.person_id ASid, person.name ASname FROM t_0_A AS A, t_1_B AS B, `person`AS person WHERE A.count = 3AND person.id = A.person_id AND person.license_id = B.id;
id
name
99716
Miranda Priestly
value
Congrats, you found the brains behind the murder! Everyone in SQL City hails you as the greatest SQL detective of all time. Time to break out the champagne!
Problem solved. It can be clearly seen above that writing Logica predicates instead of SQL queries greatly bolsters code readability and reusability.
This article was last updated on days ago, and the information described in the article may have changed.