Chapter 53 – Style¶
Note
You are reading a digital copy of SQL-99 Complete, Really, a book that documents the SQL-99 standard.
The book and the standard does not reflect the features of CrateDB, we are just publishing it as a service for the community and for reference purposes.
On the one hand, CrateDB does not implement the SQL-99 standard thoroughly, on the other hand, it extends the standard for implementing cluster features and others.
For more information specific to CrateDB, check out the CrateDB Reference documentation.
Here is one SQL operation, written in two contrasting styles. Notice the different names, spacing, indentation and choice of optional keywords:
[CASUAL]
SELECT * FROM T WHERE C1>5 ORDER BY 1 ASC;
[FORMAL]
SELECT package_id, special_handling_code, buy_date
FROM Packages
WHERE package_id > 1.0
ORDER BY package_id;
Casual style is better for notes, examples, blackboard discussions and
prototypes. We use it in this book whenever we enclose an SQL statement in an
illustrative sentence, as when we say that SELECT * FROM T WHERE C1>5 ORDER
BY 1 ASC
is an example of casual style. We don’t wear suits on beaches and we
believe that casual style is appropriate for our purposes. On the other hand, a
serious program demands some formality. There is one good reason and one bad
reason.
Appearance – Okay, this is the bad reason. Formal style persuades others, perhaps even yourself, that you used some sort of organization and design.
Coherence – You can read statements more easily if you know in advance the shape, order and vocabulary. More time is spent “reading” code than “writing” it, so you will save time by investing time at the start.
Besides, maybe you do wear suits on beaches.
A True Story
There once was a clever DBMS named O_____ (tm). The people who program O_____
were mulling in their tall building in Redwood Shores CA one day. “Sometimes we
have to run the same SQL statements twice”, they mulled. “So why should we
parse the same statement again when we already know what the Access Plan is
from the first time we ran it?” And they came up with a clever plan. They put
the first SQL statement in a cache! Then, when the second SQL statement came
along, they had a clever way of telling that it was really something they’d
done before (maybe with a parameter or two different but that doesn’t affect
the story). And they just re-used the same Access Plan and saved oodles of
time. They even used this clever plan for Views. Which was great. Except that
their way of comparing the new SQL statement with the cached SQL statement was,
well, simple. What they did was: they compared every byte in the cached
statement with every byte in the new statement. That’s it. No lower-to-upper
conversions, no lead-space trimming, nothing but a REP CMPSB
(that’s
strcmp
to you C fans).
What happened next? Well, in all the good-doobie programming shops, where everybody wrote every detail according to the same style rules … why, the programs ran faster. But in all the other programming shops, where freedom and creativity ruled, nothing happened.
The moral of the story is: choose a common style, then stick to your style.
Table of Contents
Layout Rules¶
1– Capitals¶
Write SQL <keyword>s in upper case. Write <Table name>s with an initial capital, small letters after that; when a <Table name> has two parts, use an initial capital for each part. Write <Column name>s in lower case. For example:
CREATE TABLE Widgets (
width SMALLINT);
UPDATE Temporary_Personnel SET
discharge_date = CURRENT_DATE;
Names of other, non-Column, Objects should also have an initial capital. For example:
GRANT UPDATE
ON Widgets
TO Sandra, Maria;
Exception: If a name is an acronym – e.g.: NATO or SQL or NIST – use all capital letters.
Alternative: Some write everything in lower case – it’s easier to remember. And we’ve seen examples written entirely in upper case – but not recent examples. You should remember that <regular identifiers> are stored in upper case, but it’s been a long time since any DBMS insisted that Object names must be entered in upper case in the first place.
2– Spaces¶
Put a single space after every SQL – that is, after every <keyword> and
after <identifier>s and after operators such as *
or +
or =
.
For example:
SELECT width * 5 ...
Exception: There need be no spaces after scalar or set function names, or within parentheses. For example:
SELECT MAX(width) ...
Exception: There should be no space before comma (,) or semicolon (;). There should, however, be a space after the comma. For example:
SELECT width, length ...
... DECIMAL (7, 2) ...
Exception: There should be no space at the end of a line.
Exception: In a qualified name, there should be no space before or after the period (.).
Alternative: Some would omit spaces around comparison operators or arithmetic operators. For example:
... (a=b) OR (b=c) ...
3– Line Breaks¶
[ Skip this rule if you use SQL/CLI or direct SQL]
There are alternate rules for “when to switch to a new line”, usually called
“breaking”. The common rule is to break on every clause-start <keyword>. In a
SELECT
statement, the clause-start <keyword>s are: SELECT
itself,
FROM
, ON
, WHERE
, GROUP
, HAVING
, ORDER
; the Boolean
operators AND / OR
are usually considered to be equivalent to clause-start
<keyword>s. Additionally, some prescribe a break after every full expression.
For example:
SELECT width,
length+5
FROM Widgets,
Foobars
WHERE width > 5
AND length >= 5
ORDER BY width;
Others will place multiple items on a line if there aren’t too many to fit. For example:
SELECT width, length + 5
FROM Widgets, Foobars
WHERE width > 5
AND length >= 5
ORDER BY width;
Notice the indentation at the start of each line – or lack thereof! It is rare
to find, say, a SELECT
statement starting on Column #1 but all subsequent
clauses starting on Column #3. In any case, to accomplish the lined-up effect,
one uses spaces rather than tab characters. The indentation seen in the above
SELECT
statements is to Column position 10 to accommodate ORDER BY
;
with non-SELECT
statements indent is random.
Exception: One group prefers to right-justify the main <keyword>s. For example:
SELECT width, length + 5
FROM Widgets, Foobars
WHERE width > 5
AND length >= 5
ORDER BY width;
There is a universal rule in all languages: if it’s nested, indent. So further indentation will be necessary for subqueries. For example:
SELECT width, length, width + 10
FROM Widgets
WHERE width <> 7
AND length =
(SELECT expression
FROM t);
Notice the position of the ( preceding the word SELECT
.
Alternative: Put the ( at the end of the previous line. However, placing a closing ) on the same line as the end of the statement is normal – even though this differs from the way many C programmers use {} braces and differs from the way many Pascal programmers write begin … end blocks.
Alternative: Put the initial <keyword> of a major clause on a line of its
own, as is done with WHERE
and AND
in this example:
SELECT length
FROM widgets
WHERE
(rowid = 5)
AND
(specialty = 'FUN');
Note also the different indentation of the conditional clauses.
Alternative: Put ORed
conditions on the same line, but break for
AND
. For example:
SELECT length
FROM widgets
WHERE
(rowid = 5 OR rowid = 6 OR rowid = 7)
AND width > 1;
Alternative: Add line breaks for each level of a nested function. For example:
SELECT SUBSTRING(
UPPER(title)
FROM 1 TO 5)
FROM Books;
Make a new line for UNION
. Treat AND NOT
as a single operator.
3a– Continuation Lines¶
When room is unavailable on a line, break if possible at a comma or equal sign or other operator <keyword>. If there are too many <Column name>s to fit in one line, break thus:
SELECT Column_1, Column_2, Column_3, Column_4, Column_5,
Column_6, Column_7, Column_8, Column_9, Column_10 ...
Alternative: There is a minority view that commas should be shifted. For example:
SELECT Column_1, Column_2, Column_3, Column_4, Column_5
,Column_6, Column_7, Column_8, Column_9, Column_10 ...
3b– Indenting CREATEs¶
In a CREATE TABLE
statement, every new Column goes on a new line. When it
helps, you may indent so that each part of the <Column definitiion> is at the
same position on the line – but nobody does that all the time. Here’s an
example:
CREATE TABLE Transactions
(transactionid INTEGER NOT NULL,
amount DECIMAL(7, 2),
partid INTEGER,
comments VARCHAR(3000));
Usually CREATE TABLE
statements will also have Constraint clauses. We have
split them out here, adding the Constraints in later ALTER TABLE
statements. Constraints are in fact separate from Tables, but if you find that
splitting up the Table definition into separate statements is unacceptable,
you’re not alone. The other part of this illustration shows our preference for
giving names to everything, including Constraints – see Rule 16. Here’s an
example:
ALTER Table Transactions
ADD CONSTRAINT transaction_primary_key
PRIMARY KEY (transactionid);
ALTER Table Transactions
ADD CONSTRAINT transaction_references_inventory
FOREIGN KEY partid REFERENCES Inventory(partid)
ON UPDATE CASCADE
ON DELETE CASCADE;
3c– Indenting INSERTs¶
Here is an INSERT
statement formed according to the same rules as
discussed. Notice that the SELECT
in the example is not indented like a
subquery would be.
INSERT INTO Recreation_players
(playerno, name, town, phoneno)
SELECT playerno, name, town, phoneno
FROM Players
WHERE leagueno IS NULL;
For streams of INSERT
statements, one relaxes the rules to squeeze to one
line. For example:
INSERT INTO Personnel VALUES (1, 2, 3, 'Maximum');
INSERT INTO Personnel VALUES (7, 4, 166, 'Minimum');
INSERT INTO Personnel VALUES (15, -6, 0, NULL);
It might appear nice to line up the values in this example, but that’s not what people do.
3d– Indenting UPDATEs¶
If we apply the rules consistently, then an UPDATE
statement should look
like this:
UPDATE Contacts
SET first_grade = 'A', second_grade = 'B', third_grade = 'C';
Alternative: The more common style is to break for each assignment. For example:
UPDATE Contacts
SET first_grade = 'A',
second_grade = 'B',
third_grade = 'C';
4– Statement End¶
End statements with a semicolon without a preceding space. For example:
COMMIT;
Exception: Where the semicolon is inappropriate, omit it (for example, in COBOL shops or where the vendor won’t accept it).
6– Qualifiers¶
When an SQL statement contains references to more than one Table, use <Column reference>s rather than <Column name>s. This is particularly true for joins. For example:
SELECT Widgets.length, Foobars.width
FROM Widgets, Foobars
WHERE Widgets.length = Foobars.width;
Not only does the qualification of a <Column reference> help the reader see
which Table a Column belongs to, it guards against later breakage of the code
(if, for instance, a new Column named WIDTH
is someday added to
FOOBARS
). Sometimes the qualification may have to include Schema and
<Catalog name>s too. If qualification starts to get at all lengthy, use
<Correlation name>s.
7– Shorthands¶
7a– Shorthands for lists¶
Do not use SELECT * ...
to mean “SELECT all Columns ...
”. List the
Columns you want by name. For example:
SELECT length, width
FROM Widgets;
Exception: In the set function COUNT(*)
, the asterisk is necessary and
in EXISTS (SELECT * ...
the asterisk is preferred.
Do not use an INSERT
statement without a Column list. List the Columns you
want by name. For example:
INSERT INTO Widgets
(length, width)
VALUES (1, 2);
Exception: Streams of INSERT
statements contain no Column list, see
rule 3c.
Do not use GRANT ALL PRIVILEGES
or REVOKE ALL PRIVILEGES
for a
Privilege list. List the Privileges you want by name. For example:
GRANT SELECT, UPDATE
ON TABLE Widgets
TO Sandra, Joan;
7b– Shorthands for Expressions¶
Usually, expression shorthands involve learning new syntax. For example,
COALESCE(a,b)
is short for CASE WHEN a IS NOT NULL THEN a ELSE b END
.
But we’d guess that some people would have to look up COALESCE
to find out
what it means. On the other hand, they might be able to puzzle out the longer
CASE
expression, because they’ve seen similar constructs in most other
computer languages. The consensus seems to be to use the longer expression,
rather than the shorthand – unless the shorthand itself is a common and
well-understood construct.
8– Short forms¶
For <data type>s, use short forms: CHAR
rather than CHARACTER
,
VARCHAR
rather than CHARACTER VARYING
, INT
rather than INTEGER
,
BLOB
rather than BINARY LARGE OBJECT
.
Speaking of shortness – though this has nothing to do with Rule 8 – a
too-long name is: Parts_Which_Have_No_Serial_Numbers
.
9– Redundancy¶
9a– Noise <keyword>s¶
Where a <keyword> is optional and eliminating it would cause no change in meaning, eliminate it. One example:
GRANT UPDATE, INSERT
ON Widgets ...
instead of:
GRANT UPDATE, INSERT
ON TABLE Widgets ...
Another example:
SELECT width
FROM Widgets
ORDER BY width;
instead of:
SELECT width
FROM Widgets
ORDER BY width ASC;
Another example:
COMMIT;
instead of:
COMMIT WORK;
Another example:
SELECT width
FROM Widgets;
instead of:
SELECT ALL width
FROM Widgets;
Remember Shannon and information theory: when a word adds nothing to the meaning, it is not information. It is noise.
Exception: It’s never bad to add unnecessary parentheses if there is any chance that a reader might not guess what the precedence of operators might be. For example:
SELECT (width * 5) + 4
FROM Widgets;
Exception: Although UNION DISTINCT
is not in common use, it is clear
that SQL3’s designers believe that explicitly saying DISTINCT
is good.
9b– Superfluous Clauses¶
Most SQL programmers are willing to say the same thing twice “to make the
meaning clearer”. We give two examples of this bad but normal practice. The
first shows a superfluous NOT NULL
clause:
CREATE TABLE Widgets
(width INT NOT NULL,
CONSTRAINT widget_pkey PRIMARY KEY(width));
In SQL-92 and SQL3, a primary key is automatically NOT NULL
.
The second example shows a superfluous predicate:
SELECT width
FROM Widgets
WHERE spoffo BETWEEN 'A' AND 'AZZZ'
AND spoffo LIKE 'A%';
The BETWEEN
clause is unnecessary. It’s probably there for “optimization”
reasons which are outside the scope of this chapter.
9c– Explicitizing¶
You don’t need to start any program with the SQL statement:
CONNECT TO DEFAULT;
because the DBMS would CONNECT TO DEFAULT
anyway. So should you bother?
According to one DBMS expert: yes. In general, if some critical process is
implicit (performed automatically as default behaviour), you might do good by
making it explicit (specified in the instruction). You’re making your
intentions clear not only to the reader, but also to the DBMS, so this act is
more than a mere comment. In this view, the first SQL executed statement should
be CONNECT
.
10– Literals¶
Enter <exact numeric literal>s using maximum scale but without lead zeros and
without leading +
signs. For example:
UPDATE Widgets
SET maximality = 10.00;
Exception: When using <literal>s in arithmetic expressions, use the scale
that you want to the result to have. (Note: If you want to be emphatic about
what specific numeric <data type> you are using, consider using CAST
.)
Even if a search of a character string Column is probably case-insensitive, use both upper and lower case as you would if you were inserting. For example:
SELECT surname
FROM Widgets
WHERE surname = 'Smith';
Do not put trailing spaces in <character string literal>s unless they are
necessary for comparisons with PAD SPACE
Collations.
For binary items, use X'....'
rather than B'....'
notation. For
example:
INSERT INTO Widgets (bobbet)
VALUES (X'427A');
11– Specify Character Sets¶
We can’t call this “common practice” because we haven’t seen much of _introducer use, but it would be consistent with the preceding to say that if a character string has, or will have, non-Latin letters, and the default Character set is not obvious, specify the Character set.
12. Statement Splitting¶
Most SQL programmers are willing to write very long SQL statements. There is some practical justification for this tendency: (a) if any form of “auto-commit” is in effect, then splitting up SQL statements could leave the database in an inconsistent state and (b) most DBMSs optimize at the statement level, so putting everything in one statement might provide useful information to the optimizer.
Alternative: A minority view (which we espouse) holds that separate thoughts
belong in separate sentences, as in any ordinary language. For example, we’ve
suggested before that it’s a good idea to add Constraints later (with ALTER
TABLE
), rather than mix all Constraints with <Column definition>s in the
original CREATE TABLE
statement.
13– Impossibilities¶
Consider this example of a CASE
expression:
CASE Column_of_doom
WHEN > 5 THEN '>5'
WHEN <= 5 THEN '<=5'
END
It’s hard to be sure, but it looks like the writers didn’t ask “what if
COLUMN_OF_DOOM
is NULL
?”. There should be an explicit ELSE
clause,
here to allow for that. Defensive programmers code for the “default” or
“otherwise” case, even if the case can’t possibly happen.
14– Precise Comparisons¶
Comparisons with >
and <
operators are sometimes vaguer than they need
be. For example:
name > 'X' /* what if somebody is named 'X'? */
position < 1 /* you mean position <= 0? */
By rephrasing the comparison with a >=
or <=
operator, you can
sometimes catch such problems.
15– Distributing NOTs¶
“Neither a borrower nor a lender be.” – Polonius
Instead of saying “be not a borrower or a lender” Polonius said “neither a borrower nor a lender be” – using a separate negation word for each negated thing. This was an application of one of DeMorgan’s Rules:
NOT (A OR B) can be changed to NOT(A) AND NOT(B)
NOT (A AND B) can be changed to NOT(A) OR NOT(B)
Since the changed form is closer to the way that people actually talk, it is easier to read.
Naming Rules¶
Everyone says that onomatopoeia is the oldest profession. Or, at least, they would say that, if they knew that onomatopoeia originally meant “the making of names”, and that Adam’s first job was to name all the beasts in the Garden of Eden.
16. Give Everything a Name¶
The DBMS often lets you skip giving an Object a name: it just assigns a default name. But this default name is arbitrary. And besides, no two DBMSs use the same rules for default names. So, give explicit names to expressions in select lists. For example:
SELECT (length + width) AS length_and_width
FROM Widgets;
Consider giving explicit names to Constraints in CREATE TABLE
, ALTER
TABLE
, CREATE DOMAIN
and ALTER DOMAIN
statements. If you don’t, how
will you drop the Constraints later? And how will you interpret the
diagnostics, which include <Contstraint name>s? Here’s an example:
CREATE TABLE Widgets
(length INT,
CONSTRAINT Widgets_Length_Checker CHECK (length > 0));
Exception: Usually one does not give names to simple Column Constraints
like NOT NULL
or PRIMARY KEY
.
Actually, naming is just one prominent example of a case where the DBMS will assign some “implementation-dependent” value if you don’t specify one yourself. In all such cases, it’s probably safer to specify.
17– When a Name Has Two Parts, Separate the Parts with the Underscore Character (_)¶
For example: ytd_sales initial_extent
Alternative: For <Table name>s especially, you can keep the parts
unseparated but capitalize the second word. For example: OrderItems
DepartmentNumbers
.
18– Avoid Names That Might Be Reserved Words in Some SQL Dialect¶
The way to do this is to use names that refer to objects in the real world that
you’re modelling with your database. You can be fairly sure that names like
CandyStores
, book_title
or swather
are not names that the DBMS
needs for its own purposes. If you must be absolutely sure, you can take
further measures – but there are problems with all of them.
You can use the list of <keyword>s, in our chapter on general SQL concept. This list includes reserved words used in major SQL dialects, as well as reserved words used in all standard SQL variations at the time of printing. It’s better to look it up here rather than depend on a vendor’s manual. But it’s impossible to keep such a list up to date.
You can check by passing to your DBMS an SQL statement containing the <identifier> and looking for an error message. For example, try to execute something like “
CREATE TABLE
<word> (<word>INT
);” If the SQL statement works, <word> is not a reserved word. However, this won’t tell you if some other DBMS reserves that word, or if the next version of your DBMS will reserve it.You can put underscores (
_
) in names. This is unpopular. The SQL Standards committee doesn’t intend to add <keyword>s containing underscores in any future SQL version. However, there are some exceptions: words that begin withCURRENT_
orSESSION_
orSYSTEM_
, or words that end with_LENGTH
. Underscores have special meanings when used with introducers, withLIKE
predicates and withSIMILAR
predicates. The SQL Standards committee will also avoid <keyword>s containing digits in all future versions. So tryMussels4
. But first read Rule 25.You can enclose all names with quotes (
""
). But <delimited identifier>s cause their own problems: see Rule 19.
19– Avoid <delimited identifier>s¶
The troubles with them are, first, that double quote marks are false signals to many people who are used to thinking that quote marks appear around strings instead of names. Second, there’s case sensitivity – “X” is not the same as “x”. Third, quote marks are ugly.
Exception: <Table name>s might require <delimited identifier>s, because
some DBMSs use files for Tables. File names include special characters – .
or /
or \
or :
– that are illegal in regular <identifier>s.
Exception: Microsoft Access programmers often use <delimited identifier>s
for <Table name>s (Access is a non-standard SQL which uses []
s instead of
""
s to mark the delimitation).
Exception: Applications which generate SQL statements, such as user
interfaces, might automatically enclose all <delimited identifier>s inside
""
s.
With all these exceptions, you might decide to take the minority line and use
<delimited identifier>s regularly. If you do, at least avoid names that have
lead or trailing spaces. Some DBMSs’ processes include an automatic TRIM
.
20– Names of Tables are Plural; Names of All Other Objects Are Singular¶
Thus, in the INFORMATION_SCHEMA
, we have a View named SCHEMATA
and the
Columns of this View are: CATALOG_NAME
, SCHEMA_NAME
and so on. Often a
plural is a collective noun, for example: INVENTORY
. Admittedly, this means
that <Table name>s will be longer (at least in English), but it’s a subtle
signal that distinguishes <Table name>s from other <identifier>s.
Alternative: The dissenting minority points out that the English phrases
for many tabular items are singular: "ADDRESS BOOK"
(not "ADDRESSES
BOOK"
), "PHONE BOOK"
, "INVESTMENT PORTFOLIO"
, "RESTAURANT LIST"
,
etc.
21- Use Words in Your National Language¶
The fact that SQL <keyword>s look like English is irrelevant. For example, this sample SQL statement appeared in an article in a Polish magazine:
UPDATE studenci SET nazwisko='Kowalski';
This does mean that names will sometimes include characters outside the regular
English alphabet. Obviously the effect on portability is unfortunate, but if
your DBMS doesn’t support accented characters in names then it doubtless won’t
properly support them in data values either, so why would you use such a DBMS
anyway? Precisely because you don’t know what a nazwisko
is, you can see
that a Pole would have trouble understanding the word that you use instead of
nazwisko
.
22– Don’t Worry about How <Column name>s Appear When a Table Is Displayed on the Screen¶
That’s something that changes anyway (use AS
clauses). Instead, worry about
how names appear if you print out a program. Remember: The goal is long-term
comprehension, so ephemeral considerations such as screen-display deserve low
priority.
23– Names Should Be Descriptive, but Not Too Descriptive¶
Minimally, you should avoid algebra like UPDATE k SET k1=4
– where no one
could possibly guess what k
and k1
are supposed to represent. Medianly,
you should avoid non-specific descriptors like PHONE_NUMBER
– where no one
can be sure whether the referent is a home- or office- or general-contact-
telephone number. But stop there! Avoid names like SOLDIERS_IN_THE_ARMY
because (presumably) all the soldiers in the database are in the army; the “in
the army” bit is only helpful if you also have soldiers in the navy and you
have to distinguish between them. This part of the rule – avoid making
accidents part of the identification – is analogous to one of the
normalization rules.
24– If Two Columns from Different Tables Are Based on the Same Domain, They Should Have the Same Name¶
In fact, they should have the Domain’s name. For example:
CREATE DOMAIN surname VARCHAR(25);
CREATE TABLE Students (surname surname, ...);
CREATE TABLE Professors (surname surname, ...);
This rule would apply even if your DBMS doesn’t support explicit creation of Domains, or if you use SQL3’s user-defined type feature – you’re still using the concept of Domains.
Exception: This rule does not apply for two Columns in the same Table.
Incidentally, when <Column name>s are the same, NATURAL JOIN
is easier.
That’s usually a blessing, but some caution is required – you certainly don’t
want to cause a join over two Columns which have the same name by accident.
25– Digits Are a Bad Sign¶
Too often we use digits as arbitrary distinguishers – e.g.: Lines_1 /
Lines_2
– when there is some intrinsic difference between Lines_1
and
Lines_2
that could be expressed in the names, for example,
Lines_Freshwater
and Lines_Longitude
. Particularly bad are the digits
'0'
and '1'
, which look too much like the letters 'O'
and 'l'
.
26– Try to Stop a Name at 18 Characters; The Maximum Length Allowed in SQL-89¶
Mainly, it’s hard to remember a long name. For example, do you remember if the
name mentioned in rule 8 was Parts_Which_Have_No_Serial_Numbers
? Or was it
Parts_Which_Have_No_Serialnumber
?
27– Repeat the <Table name> in the <Column name> … Not¶
Firstly, you’d end up violating rule 24. Secondly, if you make a View of the
Table you’ll have to either violate this rule, or make View <Column name>s not
equal to Table <Column name>s. For example, the INFORMATION_SCHEMA
View
called GRANTS
has a Column called IS_GRANTABLE
instead of
GRANT_IS_GRANTABLE
. Remember, if you really need to make it clear what
Table the Column is in, you can use a <Column reference>:
GRANTS.IS_GRANTABLE
.
Exception: A Column which is part of the primary key of the Table could
include the <Table name> in the singular. For example, the
INFORMATION_SCHEMA
View called SCHEMATA
has a Column called
SCHEMA_NAME
– and any foreign keys that reference SCHEMATA
would be
Columns called SCHEMA_NAME
too (assuming that Views could have such
Constraints). There are several conflicting conventions for foreign keys. In
any case, though, it is not part of your mandate to ensure that all <Column
name>s in the database must be unique.
28– Depend on a Dialect … Not¶
This can be subtle, e.g.: UCASE
is a function name that some people
seem to think is standard SQL (in fact it’s ODBC). Write with
lowest-common-denominator syntax when you can, but test it first with an
SQL3 parser to make sure you’re not going to violate a rule when you
upgrade.
29– Sometimes <Correlation name>s (Or Aliases) Are Simply Necessary Because the Actual <Table name> Is Unwieldy, Containing Qualifiers or Lengthy Path Names¶
In
Oracle, use of <Correlation name>s actually helps the optimizer. But should you
always use <Correlation name>s? No – they’re most appropriate in SELECT
statements where <Column name>s must be qualified.
30– Abbreviations¶
Legacy SQL code has frequent abbreviations: PROV
for PROVINCE
, DEPT
for DEPARTMENT
, LEN
for LENGTH
, FNAME
for FIRST NAME
and so
on. Judging from trends in other computer languages, this taste will become
obsolete. At this moment it’s still a matter of taste. A few abbreviated
prefixes/suffixes are used for some common Domains: _id
for single-Column
candidate key (e.g.: author_id
, program_id
), _no
for ordinal number
(e.g.: player_no
, receipt_no
), qty_
for quantity (e.g.:
qty_of_goods_sold
), avg_
for average (e.g.: avg_qty_of_goods_sold
),
min_
for minimum (e.g.: min_weight
), max_
for maximum (e.g.:
max_length
) and sum_
for total (e.g.: sum_balance
). Notice that
some of the prefixes are derived from SQL <keyword>s.
Examples of <Domain name>s/<Column name>s. Some names that we have seen in use in databases for banking/libraries /retail shops/government include:
firstname
,lastname
orsurname
,street
,houseno
,aptno
orunitno
orsuiteno
,city
,state
orprovince
,country
,phoneno
oremail
,sex
,birth_date
,account_id
,balance
,account_open_date
,account_close_date
,transaction_code
,author_firstname
,author_lastname
,title
,callno
,isbn
,year_published
,checkout_date
,loan_type
,amount
,itemno
,transaction_time
,transaction_code
.
Certainly we’ve seen many other names too, in many styles. We picked ones that generally fit the criteria that we’ve described heretofore.
31– Host Language Conventions¶
There is certainly an argument that this C code snippet looks fine:
EXEC SQL
INSERT INTO Recordings (szRecording)
VALUES (:szRecording);
The point here is that the C host variable szRecording
associates with the
SQL Column szRecording
. Hence the same name. In general we could say that
SQL Object names are often influenced by conventions used in the most common
host language, such as C in this case. We don’t condemn this practice, we just
ignore it, since our concern is SQL conventions rather than host language
conventions.
One detail about the szRecording
in the preceding example: it’s in a Polish
notation, that is, the sz in the name indicates the data type (string zero). We
will concern ourselves solely with the question: is it good SQL to embed <data
type> information in names, for example szrecording
or name_char
or
(more subtly) namestring
? The answer, judging as usual from what seems to
be common practice, is yes that’s okay, but nobody is doing so systematically.
Sometimes we do see <Column name>s that end in _date
or num[eric]
, but
we don’t see consistency.
For the narrower concept – Domains – we have Rule 24.
32– User Names¶
If you have control over user names, prefer first names: Ralph, Mike, Lucien. Where necessary add the first letter of the last name: JeanC, RalphK, LucienB. This convention appears to derive from names on the Internet.
Often there is no choice in this regard, because the operating system feeds user names to the DBMS.
33– Comma Lists¶
Whenever a list is disordered, people wonder why. For example, these SQL statements look a trifle curious:
SELECT firstname, travel_allowance, surname
FROM SalesPersons;
SELECT *
FROM States
WHERE state_abbreviation IN('WY','MI','AK','CO');
If there is some hidden order, add a note explaining what it is. Otherwise, change to a natural or alphabetical order.
Examples of Statements in Formal Style¶
Here are some actual SQL statement examples. We have not edited them to fit all the rules in this chapter.
ALTER Table Countries
ADD gnp DECIMAL(8, 2);
COMMIT;
CONNECT TO 'c:\db';
CREATE TABLE players
(playerno SMALLINT NOT NULL PRIMARY KEY,
name CHAR(15) NOT NULL ,
leagueno SMALLINT DEFAULT 99);
ALTER Table players
ADD Constraint check_playerno
CHECK (playerno BETWEEN 00 AND 99);
CREATE VIEW ages (playerno, age) AS
SELECT playerno, 1986 - YEAR_OF_BIRTH
FROM Players;
DELETE FROM Order_Items
WHERE partid IN (
SELECT partid
FROM Inventory
WHERE description LIKE 'd%');
GRANT SELECT, UPDATE
ON Jackets
TO Carol, Kathleen;
INSERT INTO Gradings (gradeno, inspectorid, description)
VALUES (3, ?, 'Prime');
INSERT INTO Temporary_Workers (workerid, name, town, phoneno)
SELECT workerid, name, town, phoneno
FROM Workers
WHERE benefit IS NULL;
SELECT title
FROM Videos
WHERE out_date > DATE '1994-07-06'
GROUP BY title;
HAVING COUNT(*) > 1
ORDER BY title;
SELECT accountid, balance
FROM Accounts
WHERE (town = 'Amesville'
OR balance < 0);
AND NOT (town = 'Amityville'
AND balance < 0);
SELECT NAME, TOWN
FROM PLAYERS
WHERE TOWN IN ('Inglewood', 'Stratford')
SELECT first_name, last_name
FROM Students
WHERE studentno IN
(SELECT studentno
FROM Delinquent_Students
WHERE excuse IN ('sick','unhappy','deprived')) UNION
SELECT first_name, last_name
FROM Teachers
WHERE teacher_status = 'sick';
SELECT Realtors.name AS realtor_name, Vendors.name AS vendor_name
FROM Members_Of_Real_Estate_Board Realtors, Sellers_Of_Farm_Land Vendors
WHERE Realtors.name = Vendors.contact_name;
UPDATE Addresses
SET street = ?,
houseno = ?,
town = ?,
state_or_province = ?,
zip_or_postal_code = ?,
country = ?
WHERE CURRENT OF Selection;
SELECT title, release, censorrtg, runtime
FROM title
WHERE runtime BETWEEN 120 AND 231
ORDER BY release DESC
Host Language Programs¶
Some programmers keep SQL statements apart from host language statements, in separate procedures (or even separate modules). Others allow mixing, as in this (embedded Pascal SQL) example:
EXEC SQL SELECT COUNT(*) INTO :counter FROM Specifiers;
if (counter=0) Writeln('Empty Table!');
if (counter>0) begin
EXEC SQL UPDATE Specifiers SET process_count = process_count + 1;
end;
The following style notes apply to either SQL/CLI or to embedded SQL.
34– Host Language Variable Names Should Be Similar to Corresponding SQL <Column name>s but Not Identical¶
35– Put Comments in the Host Program (Not Inside the SQL Statement) Using the Host Program’s Comment Style¶
36– Employ Assertions¶
Here we use the word “assertion” in the non-SQL sense: a conditional statement that you’d like to have in the debug version but not in the production version.
SQL is interpretive, so all “asserting” has to take place at runtime. In
programs, the best method is to add executable SQL statements with
#if/#endif
host language directives.
Caution
Some SQL precompilers ignore #if
and #endif
. For example, this
assertion example checks that Column PROGRAMS
. SUMMARY
has matching
definitions in both C and SQL. The format of the SQLExecDirect
call is
not a common style; you should take it merely as a suggestion.
...
#define SUMMARY_LENGTH [500]
SQLCHAR summary[SUMMARY_LENGTH+1];
#if DEBUG_ON
SQLINTEGER character_maximum_length;
#endif
...
/* assertion start */
#if DEBUG_ON
character_maximum_length = 0;
SQLExecDirect(hstmt,"SELECT CHARACTER_MAXIMUM_LENGTH\
FROM INFORMATION_SCHEMA.COLUMNS\
WHERE TABLE_NAME = 'PROGRAMS'\
AND COLUMN_NAME = 'SUMMARY'",
SQL_NTS);
SQLFetch(hstmt);
SQLGetData(hstmt,1,SQL_INTEGER,&character_maximum_length,NULL,NULL);
if (character_maximum_length <> SUMMARY_LENGTH) exit(1);
if (SQLFetch(hstmt) != SQL_NO_DATA) exit(1);
#endif
/* assertion end -- if you survive to here, things are all right */
SQLExecDirect(hstmt,"SELECT summary FROM Programs",SQL_NTS);
The following style notes apply only to SQL/CLI.
37– Use Conventional Names¶
For example, resource handles are henv
, hdbc
, hstmt
and hdesc
.
When there is more than one stmt
, use ordinals: hstmt1
, hstmt2
and
so on.
38– Declare Variables with Constants or Macros Supplied in sqlcli.h¶
For example:
#include "sqlcli.h"
#define NAMELEN 50
...
SQLCHAR name[NAMELEN];
SQLCHAR create[] = "CREATE TABLE NameID (\
id INT,name CHAR(50))";
...
39– If an SQL Statement Contains Constants Which Are Also Used in the Host Program, Check or Edit the Statement at Runtime¶
For example, add this to the last example:
...
sprintf(create,"CREATE TABLE NameID(id INT,name CHAR(%d))",NAMELEN);
...
40– When Testing a Function’s Return Values, Programmers Use Various Styles¶
For example:
if (SQL_ERROR == SQLAllocStmt (hdbc,&hstmt)) goto error;
if (sqlreturn = (SQLExecute(hstmt)) < 0) {
printf("sqlreturn = %ld\n",sqlreturn);
exit(1); }
Summary¶
Whatever you write, may later be read.
Whatever style you choose for serious programs, stick with it consistently.
5– Comments¶
A simple comment begins with a double minus sign (--) and ends at the next line. Unfortunately, if an SQL statement comes from keyboard input, then the dialog manager will strip the line breaks. And, OSs disagree whether a line break is one character (LF) or two (CR+LF). Because of the danger that presents to the parser, many eschew all comments in the SQL statement and put them in the host language code. For example:
The problem disappears if your DBMS supports SQL3, which allows C-like comments – i.e.: comments that begin with /* and end with */. Although C-like comments are far from universal, they are the preferred style among Microsoft SQL Server users. Occasionally they are even used for section headings, thus:
Speaking of comments, Weinberg (The Psychology of Computer Programming) suggested that code and comments should be written in separate columns. This would make it easier to focus on the code when debugging (if a program has bugs then the comments are probably lies).