Rules Reference¶
Rules in sqlfluff are implemented as crawlers. These are entities which work their way through the parsed structure of a query to evaluate a particular rule or set of rules. The intent is that the definition of each specific rule should be really streamlined and only contain the logic for the rule itself, with all the other mechanics abstracted away.
Specific Rules¶
Standard SQL Linting Rules.
-
class
Rule_L001
(code, description, **kwargs)¶ sqlfluff fix
compatible.Unnecessary trailing whitespace.
Anti-patternThe • character represents a space.SELECT a FROM foo••
Best practiceRemove trailing spaces.SELECT a FROM foo
-
class
Rule_L002
(code, description, **kwargs)¶ Mixed Tabs and Spaces in single whitespace.
Configurationtab_space_size: The number of spaces to consider equal to one tab. Used in the fixing step of this rule. Must be one of range(0, 100).This rule will fail if a single section of whitespace contains both tabs and spaces.
Anti-patternThe • character represents a space and the → character represents a tab.In this example, the second line contains two spaces and one tab.SELECT ••→a FROM foo
Best practiceChange the line to use spaces only.SELECT ••••a FROM foo
-
class
Rule_L003
(code, description, **kwargs)¶ sqlfluff fix
compatible.Indentation not consistent with previous lines.
Configurationtab_space_size: The number of spaces to consider equal to one tab. Used in the fixing step of this rule. Must be one of range(0, 100).indent_unit: Whether to use tabs or spaces to add new indents. Must be one of [‘space’, ‘tab’].lint_templated_tokens: Should lines starting with a templating placeholder such as {{blah}} have their indentation linted.. Must be one of [True, False].- Note:
This rule used to be _”Indentation length is not a multiple of tab_space_size”_, but was changed to be much smarter.
Anti-patternThe • character represents a space.In this example, the third line contains five spaces instead of four.SELECT ••••a, •••••b FROM foo
Best practiceChange the indentation to use a multiple of four spaces.SELECT ••••a, ••••b FROM foo
-
class
Rule_L004
(code, description, **kwargs)¶ Mixed Tab and Space indentation found in file.
Anti-patternThe • character represents a space and the → character represents a tab.In this example, the second line is indented with spaces and the third one with tab.SELECT ••••a, → b FROM foo
Best practiceChange the line to use spaces only.SELECT ••••a, ••••b FROM foo
-
class
Rule_L005
(code, description, **kwargs)¶ sqlfluff fix
compatible.Commas should not have whitespace directly before them.
Unless it’s an indent. Trailing/leading commas are dealt with in a different rule.
Anti-patternThe • character represents a space.There is an extra space in line two before the comma.SELECT a•, b FROM foo
Best practiceRemove the space before the comma.SELECT a, b FROM foo
-
class
Rule_L006
(code, description, **kwargs)¶ sqlfluff fix
compatible.Operators should be surrounded by a single whitespace.
Anti-patternThe • character represents a space.In this example, there is a space missing space between the operator and ‘b’.SELECT a +b FROM foo
Best practiceKeep a single space.SELECT a + b FROM foo
-
class
Rule_L007
(code, description, **kwargs)¶ Operators near newlines should be after, not before the newline.
Anti-patternThe • character represents a space.In this example, the operator ‘+’ should not be at the end of the second line.SELECT a + b FROM foo
Best practicePlace the operator after the newline.SELECT a + b FROM foo
-
class
Rule_L008
(code, description, **kwargs)¶ sqlfluff fix
compatible.Commas should be followed by a single whitespace unless followed by a comment.
Anti-patternThe • character represents a space.In this example, there is no space between the comma and ‘zoo’.SELECT * FROM foo WHERE a IN ('plop','zoo')
Best practiceKeep a single space after the comma.SELECT * FROM foo WHERE a IN ('plop',•'zoo')
-
class
Rule_L009
(code, description, **kwargs)¶ sqlfluff fix
compatible.Files must end with a trailing newline.
-
class
Rule_L010
(code, description, **kwargs)¶ sqlfluff fix
compatible.Inconsistent capitalisation of keywords.
Configurationcapitalisation_policy: The capitalisation policy to enforce. Must be one of [‘consistent’, ‘upper’, ‘lower’, ‘capitalise’].Anti-patternIn this example, ‘select ‘is in lower-case whereas ‘FROM’ is in upper-case.select a FROM foo
Best practiceMake all keywords either in upper-case or in lower-caseSELECT a FROM foo -- Also good select a from foo
-
class
Rule_L011
(code, description, **kwargs)¶ sqlfluff fix
compatible.Implicit aliasing of table not allowed. Use explicit AS clause.
Anti-patternIn this example, the alias ‘voo’ is implicit.SELECT voo.a FROM foo voo
Best practiceAdd AS to make it explicit.SELECT voo.a FROM foo AS voo
-
class
Rule_L012
(code, description, **kwargs)¶ Implicit aliasing of column not allowed. Use explicit AS clause.
NB: This rule inherits its functionality from obj:Rule_L011 but is seperate so that they can be enabled and disabled seperately.
-
class
Rule_L013
(code, description, **kwargs)¶ Column expression without alias. Use explicit AS clause.
Configurationallow_scalar: Whether or not to allow a single element in the select clause to be without an alias. Must be one of [True, False].Anti-patternIn this example, there is no alias for both sums.SELECT sum(a), sum(b) FROM foo
Best practiceAdd aliases.SELECT sum(a) AS a_sum, sum(b) AS b_sum FROM foo
-
class
Rule_L014
(code, description, **kwargs)¶ Inconsistent capitalisation of unquoted identifiers.
Configurationcapitalisation_policy: The capitalisation policy to enforce. Must be one of [‘consistent’, ‘upper’, ‘lower’, ‘capitalise’].The functionality for this rule is inherited from
Rule_L010
.
-
class
Rule_L015
(code, description, **kwargs)¶ DISTINCT used with parentheses.
Anti-patternIn this example, parenthesis are not needed and confuseDISTINCT with a function. The parethesis can also be misleadingin which columns they apply to.SELECT DISTINCT(a), b FROM foo
Best practiceRemove parenthesis to be clear that the DISTINCT applies toboth columns.SELECT DISTINCT a, b FROM foo
-
class
Rule_L016
(code, description, **kwargs)¶ sqlfluff fix
compatible.Line is too long
Configurationmax_line_length: The maximum length of a line to allow without raising a violation. Must be one of range(0, 1000).tab_space_size: The number of spaces to consider equal to one tab. Used in the fixing step of this rule. Must be one of range(0, 100).indent_unit: Whether to use tabs or spaces to add new indents. Must be one of [‘space’, ‘tab’].
-
class
Rule_L017
(code, description, **kwargs)¶ sqlfluff fix
compatible.Function name not immediately followed by bracket.
Anti-patternIn this example, there is a space between the function and the parenthesis.SELECT sum (a) FROM foo
Best practiceRemove the space between the function and the parenthesis.SELECT sum(a) FROM foo
-
class
Rule_L018
(code, description, **kwargs)¶ sqlfluff fix
compatible.WITH clause closing bracket should be aligned with WITH keyword.
Anti-patternThe • character represents a space.In this example, the closing bracket is not aligned with WITH keyword.WITH zoo AS ( SELECT a FROM foo ••••) SELECT * FROM zoo
Best practiceRemove the spaces to align the WITH keyword with the closing bracket.WITH zoo AS ( SELECT a FROM foo ) SELECT * FROM zoo
-
class
Rule_L019
(code, description, **kwargs)¶ sqlfluff fix
compatible.Leading/Trailing comma enforcement.
Configurationcomma_style: The comma style to to enforce. Must be one of [‘leading’, ‘trailing’].Anti-patternThere is a mixture of leading and trailing commas.SELECT a , b, c FROM foo
Best practiceBy default sqlfluff prefers trailing commas, however itis configurable for leading commas. Whichever option you choseit does expect you to be consistent.SELECT a, b, c FROM foo -- Alternatively, set the configuration file to 'leading' -- and then the following would be acceptable: SELECT a , b , c FROM foo
-
class
Rule_L020
(code, description, **kwargs)¶ Table aliases should be unique within each clause.
-
class
Rule_L021
(code, description, **kwargs)¶ Ambiguous use of DISTINCT in select statement with GROUP BY.
Anti-patternDISTINCT and GROUP BY are conflicting.SELECT DISTINCT a FROM foo GROUP BY a
Best practiceRemove DISTINCT or GROUP BY. In our case, removing GROUP BY is better.SELECT DISTINCT a FROM foo
-
class
Rule_L022
(code, description, **kwargs)¶ sqlfluff fix
compatible.Blank line expected but not found after CTE definition.
Anti-patternThere is no blank line after the CTE. In queries with manyCTEs this hinders readability.WITH plop AS ( SELECT * FROM foo ) SELECT a FROM plop
Best practiceAdd a blank line.WITH plop AS ( SELECT * FROM foo ) SELECT a FROM plop
-
class
Rule_L023
(code, description, **kwargs)¶ sqlfluff fix
compatible.Single whitespace expected after AS in WITH clause.
Anti-patternWITH plop AS( SELECT * FROM foo ) SELECT a FROM plop
Best practiceThe • character represents a space.Add a space after AS, to avoid confusingit for a function.WITH plop AS•( SELECT * FROM foo ) SELECT a FROM plop
-
class
Rule_L024
(code, description, **kwargs)¶ Single whitespace expected after USING in JOIN clause.
Anti-patternSELECT b FROM foo LEFT JOIN zoo USING(a)
Best practiceThe • character represents a space.Add a space after USING, to avoid confusing itfor a function.SELECT b FROM foo LEFT JOIN zoo USING•(a)
-
class
Rule_L025
(code, description, **kwargs)¶ Tables should not be aliased if that alias is not used.
Anti-patternSELECT a FROM foo AS zoo
Best practiceUse the alias or remove it. An usused alias makes codeharder to read without changing any functionality.SELECT zoo.a FROM foo AS zoo -- Alternatively... SELECT a FROM foo
-
class
Rule_L026
(code, description, **kwargs)¶ References cannot reference objects not present in FROM clause.
Anti-patternIn this example, the reference ‘vee’ has not been declared.SELECT vee.a FROM foo
Best practiceRemove the reference.SELECT a FROM foo
-
class
Rule_L027
(code, description, **kwargs)¶ References should be qualified if select has more than one referenced table/view.
NB: Except if they’re present in a USING clause.
Anti-patternIn this example, the reference ‘vee’ has not been declaredand the variables ‘a’ and ‘b’ are potentially ambiguous.SELECT a, b FROM foo LEFT JOIN vee ON vee.a = foo.a
Best practiceAdd the references.SELECT foo.a, vee.b FROM foo LEFT JOIN vee ON vee.a = foo.a
-
class
Rule_L028
(code, description, **kwargs)¶ References should be consistent in statements with a single table.
Configurationsingle_table_references: The expectation for references in single-table select. Must be one of [‘consistent’, ‘qualified’, ‘unqualified’].Anti-patternIn this example, only the field b is referenced.SELECT a, foo.b FROM foo
Best practiceRemove all the reference or reference all the fields.SELECT a, b FROM foo -- Also good SELECT foo.a, foo.b FROM foo
-
class
Rule_L029
(code, description, **kwargs)¶ Keywords should not be used as identifiers.
Configurationonly_aliases: Whether or not to flags violations for only alias expressions or all unquoted identifiers. Must be one of [True, False].Anti-patternIn this example, SUM function is used as an alias.SELECT sum.a FROM foo AS sum
Best practiceAvoid keywords as the name of an alias.SELECT vee.a FROM foo AS vee
-
class
Rule_L030
(code, description, **kwargs)¶ Inconsistent capitalisation of function names.
Configurationcapitalisation_policy: The capitalisation policy to enforce. Must be one of [‘consistent’, ‘upper’, ‘lower’, ‘capitalise’].The functionality for this rule is inherited from
Rule_L010
.Anti-patternIn this example, the two SUM functions don’t have the same capitalisation.SELECT sum(a) AS aa, SUM(b) AS bb FROM foo
Best practiceMake the case consistent.SELECT sum(a) AS aa, sum(b) AS bb FROM foo
-
class
Rule_L031
(code, description, **kwargs)¶ sqlfluff fix
compatible.Avoid table aliases in from clauses and join conditions.
Anti-patternIn this example, alias ‘o’ is used for the orders table, and ‘c’ is used for ‘customers’ table.SELECT COUNT(o.customer_id) as order_amount, c.name FROM orders as o JOIN customers as c on o.id = c.user_id
Best practiceAvoid aliases.SELECT COUNT(orders.customer_id) as order_amount, customers.name FROM orders JOIN customers on orders.id = customers.user_id -- Self-join will not raise issue SELECT table.a, table_alias.b, FROM table LEFT JOIN table AS table_alias ON table.foreign_key = table_alias.foreign_key
-
class
Rule_L032
(code, description, **kwargs)¶ Prefer specifying join keys instead of using “USING”.
Anti-patternSELECT table_a.field_1, table_b.field_2 FROM table_a INNER JOIN table_b USING (id)
Best practiceSpecify the keys directlySELECT table_a.field_1, table_b.field_2 FROM table_a INNER JOIN table_b ON table_a.id = table_b.id
-
class
Rule_L033
(code, description, **kwargs)¶ UNION ALL is preferred over UNION.
Anti-patternIn this example, UNION ALL should be preferred over UNIONSELECT a, b FROM table_1 UNION SELECT a, b FROM table_2
Best practiceReplace UNION with UNION ALLSELECT a, b FROM table_1 UNION ALL SELECT a, b FROM table_2
-
class
Rule_L034
(code, description, **kwargs)¶ sqlfluff fix
compatible.Use wildcards then simple select targets before calculations and aggregates.
Anti-patternselect a, *, row_number() over (partition by id order by date) as y, b from x
Best practiceOrder “select” targets in ascending complexityselect *, a, b, row_number() over (partition by id order by date) as y from x
Implementation¶
-
class
RuleSet
(name, config_info)¶ Class to define a ruleset.
A rule set is instantiated on module load, but the references to each of it’s classes are instantiated at runtime. This means that configuration values can be passed to those rules live and be responsive to any changes in configuration from the path that the file is in.
Rules should be fetched using the
get_rulelist()
command which also handles any filtering (i.e. whitelisting and blacklisting).New rules should be added to the instance of this class using the
register()
decorator. That decorator registers the class, but also performs basic type and name-convention checks.The code for the rule will be parsed from the name, the description from the docstring. The eval function is assumed that it will be overriden by the subclass, and the parent class raises an error on this function if not overriden.
-
copy
()¶ Return a copy of self with a seperate register.
-
document_configuration
(cls)¶ Add a ‘Configuration’ section to a Rule docstring.
Utilize the the metadata in config_info to dynamically document the configuration options for a given rule.
This is a little hacky, but it allows us to propogate configuration options in the docs, from a single source of truth.
-
document_fix_compatible
(cls)¶ Mark the rule as fixable in the documentation.
-
get_rulelist
(config)¶ Use the config to return the appropriate rules.
We use the config both for whitelisting and blacklisting, but also for configuring the rules given the given config.
- Returns
list
of instantiatedBaseCrawler
.
-
register
(cls)¶ Decorate a class with this to add it to the ruleset.
@myruleset.register class Rule_L001(BaseCrawler): "Description of rule." def eval(self, **kwargs): return LintResult()
We expect that rules are defined as classes with the name Rule_XXXX where XXXX is of the form LNNN, where L is a letter (literally L for linting by default) and N is a three digit number.
If this receives classes by any other name, then it will raise a
ValueError
.
-
-
class
BaseCrawler
(code, description, **kwargs)¶ The base class for a crawler, of which all rules are derived from.
- Parameters
code (
str
) – The identifier for this rule, used in inclusion or exclusion.description (
str
) – A human readable description of what this rule does. It will be displayed when any violations are found.
-
_eval
(**kwargs)¶ Evaluate this rule against the current context.
This should indicate whether a linting violation has occured and/or whether there is something to remember from this evaluation.
Note that an evaluate function shoul always accept **kwargs, but if it relies on any available kwargs, it should explicitly call them out at definition.
- Returns
LintResult
orNone
.
The reason that this method is called
_eval()
and not eval is a bit of a hack with sphinx autodoc, to make it so that the rule documentation auto-generates nicely.
-
crawl
(segment, dialect, parent_stack=None, siblings_pre=None, siblings_post=None, raw_stack=None, fix=False, memory=None)¶ Recursively perform the crawl operation on a given segment.
- Returns
A tuple of (vs, raw_stack, fixes, memory)
-
static
filter_meta
(segments, keep_meta=False)¶ Filter the segments to non-meta.
Or optionally the opposite if keep_meta is True.
-
classmethod
get_parent_of
(segment, root_segment)¶ Return the segment immediately containing segment.
NB: This is recursive.
- Parameters
segment – The segment to look for.
root_segment – Some known parent of the segment we’re looking for (although likely not the direct parent in question).
-
classmethod
make_keyword
(raw, pos_marker)¶ Make a keyword segment.
-
classmethod
make_newline
(pos_marker, raw=None)¶ Make a newline segment.
-
classmethod
make_whitespace
(raw, pos_marker)¶ Make a whitespace segment.
-
class
LintResult
(anchor=None, fixes=None, memory=None, description=None)¶ A class to hold the results of a crawl operation.
- Parameters
anchor (
BaseSegment
, optional) – A segment which represents the position of the a problem. NB: Each fix will also hold it’s own reference to position, so this position is mostly for alterting the user to where the problem is.fixes (
list
ofLintFix
, optional) – An array of any fixes which would correct this issue. If not present then it’s assumed that this issue will have to manually fixed.memory (
dict
, optional) – An object which stores any working memory for the crawler. The memory returned in any LintResult will be passed as an input to the next segment to be crawled.description (
str
, optional) – A description of the problem identified as part of this result. This will override the description of the rule as what gets reported to the user with the problem if provided.
-
to_linting_error
(rule)¶ Convert a linting result to a
SQLLintError
if appropriate.
-
class
LintFix
(edit_type, anchor, edit=None)¶ A class to hold a potential fix to a linting violation.
- Parameters
edit_type (
str
) – One of create, edit, delete to indicate the kind of fix this represents.anchor (
BaseSegment
) – A segment which represents the position that this fix should be applied at. For deletions it represents the segment to delete, for creations it implies the position to create at (with the existing element at this position to be moved after the edit), for an edit it implies the segment to be replaced.edit (
BaseSegment
, optional) – For edit and create fixes, this hold the segment, or iterable of segments to create to replace at the given anchor point.
-
is_trivial
()¶ Return true if the fix is trivial.
Trivial edits are: - Anything of zero length. - Any edits which result in themselves.
Removing these makes the routines which process fixes much faster.
The _eval function of each rule should take enough arguments that it can evaluate the position of the given segment in relation to it’s neighbors, and that the segment which finally “triggers” the error, should be the one that would be corrected OR if the rule relates to something that is missing, then it should flag on the segment FOLLOWING, the place that the desired element is missing.
Inline Ignoring Errors¶
sqlfluff features inline error ignoring. For example, the following will
ignore the lack of whitespace surrounding the *
operator.
a.a*a.b AS bad_1 -- noqa: L006
Multiple rules can be ignored by placing them in a comma-delimited list.
a.a * a.b AS bad_2, -- noqa: L007, L006
It is also possible to ignore non-rule based errors, and instead opt to
ignore templating (TMP
) & parsing (PRS
) errors.
WHERE dt >= DATE_ADD(CURRENT_DATE(), INTERVAL -2 DAY) -- noqa: PRS
Should the need arise, not specifying specific rules to ignore will ignore all rules on the given line.
a.a*a.b AS bad_3 -- noqa