Rules Reference

Rules in sqlfluff are implemented as crawlers. These are entities which work their way through the parsed structure of a query to evaluate a particular rule or set of rules. The intent is that the definition of each specific rule should be really streamlined and only contain the logic for the rule itself, with all the other mechanics abstracted away.

Specific Rules

Standard SQL Linting Rules.

class Rule_L001(code, description, **kwargs)

sqlfluff fix compatible.

Unnecessary trailing whitespace.

Anti-pattern
The • character represents a space.
SELECT
    a
FROM foo••
Best practice
Remove trailing spaces.
SELECT
    a
FROM foo
class Rule_L002(code, description, **kwargs)

Mixed Tabs and Spaces in single whitespace.

Configuration
tab_space_size: The number of spaces to consider equal to one tab. Used in the fixing step of this rule. Must be one of range(0, 100).

This rule will fail if a single section of whitespace contains both tabs and spaces.

Anti-pattern
The • character represents a space and the → character represents a tab.
In this example, the second line contains two spaces and one tab.
SELECT
••→a
FROM foo
Best practice
Change the line to use spaces only.
SELECT
••••a
FROM foo
class Rule_L003(code, description, **kwargs)

sqlfluff fix compatible.

Indentation not consistent with previous lines.

Configuration
tab_space_size: The number of spaces to consider equal to one tab. Used in the fixing step of this rule. Must be one of range(0, 100).

indent_unit: Whether to use tabs or spaces to add new indents. Must be one of [‘space’, ‘tab’].

lint_templated_tokens: Should lines starting with a templating placeholder such as {{blah}} have their indentation linted.. Must be one of [True, False].

Note:

This rule used to be _”Indentation length is not a multiple of tab_space_size”_, but was changed to be much smarter.

Anti-pattern
The • character represents a space.
In this example, the third line contains five spaces instead of four.
SELECT
••••a,
•••••b
FROM foo
Best practice
Change the indentation to use a multiple of four spaces.
SELECT
••••a,
••••b
FROM foo
class Rule_L004(code, description, **kwargs)

Mixed Tab and Space indentation found in file.

Anti-pattern
The • character represents a space and the → character represents a tab.
In this example, the second line is indented with spaces and the third one with tab.
SELECT
••••a,
→   b
FROM foo
Best practice
Change the line to use spaces only.
SELECT
••••a,
••••b
FROM foo
class Rule_L005(code, description, **kwargs)

sqlfluff fix compatible.

Commas should not have whitespace directly before them.

Unless it’s an indent. Trailing/leading commas are dealt with in a different rule.

Anti-pattern
The • character represents a space.
There is an extra space in line two before the comma.
SELECT
    a•,
    b
FROM foo
Best practice
Remove the space before the comma.
SELECT
    a,
    b
FROM foo
class Rule_L006(code, description, **kwargs)

sqlfluff fix compatible.

Operators should be surrounded by a single whitespace.

Anti-pattern
The • character represents a space.
In this example, there is a space missing space between the operator and ‘b’.
SELECT
    a +b
FROM foo
Best practice
Keep a single space.
SELECT
    a + b
FROM foo
class Rule_L007(code, description, **kwargs)

Operators near newlines should be after, not before the newline.

Anti-pattern
The • character represents a space.
In this example, the operator ‘+’ should not be at the end of the second line.
SELECT
    a +
    b
FROM foo
Best practice
Place the operator after the newline.
SELECT
    a
    + b
FROM foo
class Rule_L008(code, description, **kwargs)

sqlfluff fix compatible.

Commas should be followed by a single whitespace unless followed by a comment.

Anti-pattern
The • character represents a space.
In this example, there is no space between the comma and ‘zoo’.
SELECT
    *
FROM foo
WHERE a IN ('plop','zoo')
Best practice
Keep a single space after the comma.
SELECT
    *
FROM foo
WHERE a IN ('plop',•'zoo')
class Rule_L009(code, description, **kwargs)

sqlfluff fix compatible.

Files must end with a trailing newline.

class Rule_L010(code, description, **kwargs)

sqlfluff fix compatible.

Inconsistent capitalisation of keywords.

Configuration
capitalisation_policy: The capitalisation policy to enforce. Must be one of [‘consistent’, ‘upper’, ‘lower’, ‘capitalise’].

Anti-pattern
In this example, ‘select ‘is in lower-case whereas ‘FROM’ is in upper-case.
select
    a
FROM foo
Best practice
Make all keywords either in upper-case or in lower-case
SELECT
    a
FROM foo

-- Also good

select
    a
from foo
class Rule_L011(code, description, **kwargs)

sqlfluff fix compatible.

Implicit aliasing of table not allowed. Use explicit AS clause.

Anti-pattern
In this example, the alias ‘voo’ is implicit.
SELECT
    voo.a
FROM foo voo
Best practice
Add AS to make it explicit.
SELECT
    voo.a
FROM foo AS voo
class Rule_L012(code, description, **kwargs)

Implicit aliasing of column not allowed. Use explicit AS clause.

NB: This rule inherits its functionality from obj:Rule_L011 but is seperate so that they can be enabled and disabled seperately.

class Rule_L013(code, description, **kwargs)

Column expression without alias. Use explicit AS clause.

Configuration
allow_scalar: Whether or not to allow a single element in the select clause to be without an alias. Must be one of [True, False].

Anti-pattern
In this example, there is no alias for both sums.
SELECT
    sum(a),
    sum(b)
FROM foo
Best practice
Add aliases.
SELECT
    sum(a) AS a_sum,
    sum(b) AS b_sum
FROM foo
class Rule_L014(code, description, **kwargs)

Inconsistent capitalisation of unquoted identifiers.

Configuration
capitalisation_policy: The capitalisation policy to enforce. Must be one of [‘consistent’, ‘upper’, ‘lower’, ‘capitalise’].

The functionality for this rule is inherited from Rule_L010.

class Rule_L015(code, description, **kwargs)

DISTINCT used with parentheses.

Anti-pattern
In this example, parenthesis are not needed and confuse
DISTINCT with a function. The parethesis can also be misleading
in which columns they apply to.
SELECT DISTINCT(a), b FROM foo
Best practice
Remove parenthesis to be clear that the DISTINCT applies to
both columns.
SELECT DISTINCT a, b FROM foo
class Rule_L016(code, description, **kwargs)

sqlfluff fix compatible.

Line is too long

Configuration
max_line_length: The maximum length of a line to allow without raising a violation. Must be one of range(0, 1000).

tab_space_size: The number of spaces to consider equal to one tab. Used in the fixing step of this rule. Must be one of range(0, 100).

indent_unit: Whether to use tabs or spaces to add new indents. Must be one of [‘space’, ‘tab’].

class Rule_L017(code, description, **kwargs)

sqlfluff fix compatible.

Function name not immediately followed by bracket.

Anti-pattern
In this example, there is a space between the function and the parenthesis.
SELECT
    sum (a)
FROM foo
Best practice
Remove the space between the function and the parenthesis.
SELECT
    sum(a)
FROM foo
class Rule_L018(code, description, **kwargs)

sqlfluff fix compatible.

WITH clause closing bracket should be aligned with WITH keyword.

Anti-pattern
The • character represents a space.
In this example, the closing bracket is not aligned with WITH keyword.
WITH zoo AS (
    SELECT a FROM foo
••••)

SELECT * FROM zoo
Best practice
Remove the spaces to align the WITH keyword with the closing bracket.
WITH zoo AS (
    SELECT a FROM foo
)

SELECT * FROM zoo
class Rule_L019(code, description, **kwargs)

sqlfluff fix compatible.

Leading/Trailing comma enforcement.

Configuration
comma_style: The comma style to to enforce. Must be one of [‘leading’, ‘trailing’].

Anti-pattern
There is a mixture of leading and trailing commas.
SELECT
    a
    , b,
    c
FROM foo
Best practice
By default sqlfluff prefers trailing commas, however it
is configurable for leading commas. Whichever option you chose
it does expect you to be consistent.
SELECT
    a,
    b,
    c
FROM foo

-- Alternatively, set the configuration file to 'leading'
-- and then the following would be acceptable:

SELECT
    a
    , b
    , c
FROM foo
class Rule_L020(code, description, **kwargs)

Table aliases should be unique within each clause.

class Rule_L021(code, description, **kwargs)

Ambiguous use of DISTINCT in select statement with GROUP BY.

Anti-pattern
DISTINCT and GROUP BY are conflicting.
SELECT DISTINCT
    a
FROM foo
GROUP BY a
Best practice
Remove DISTINCT or GROUP BY. In our case, removing GROUP BY is better.
SELECT DISTINCT
    a
FROM foo
class Rule_L022(code, description, **kwargs)

sqlfluff fix compatible.

Blank line expected but not found after CTE definition.

Anti-pattern
There is no blank line after the CTE. In queries with many
CTEs this hinders readability.
WITH plop AS (
    SELECT * FROM foo
)
SELECT a FROM plop
Best practice
Add a blank line.
WITH plop AS (
    SELECT * FROM foo
)

SELECT a FROM plop
class Rule_L023(code, description, **kwargs)

sqlfluff fix compatible.

Single whitespace expected after AS in WITH clause.

Anti-pattern
WITH plop AS(
    SELECT * FROM foo
)

SELECT a FROM plop
Best practice
The • character represents a space.
Add a space after AS, to avoid confusing
it for a function.
WITH plop AS•(
    SELECT * FROM foo
)

SELECT a FROM plop
class Rule_L024(code, description, **kwargs)

Single whitespace expected after USING in JOIN clause.

Anti-pattern
SELECT b
FROM foo
LEFT JOIN zoo USING(a)
Best practice
The • character represents a space.
Add a space after USING, to avoid confusing it
for a function.
SELECT b
FROM foo
LEFT JOIN zoo USING•(a)
class Rule_L025(code, description, **kwargs)

Tables should not be aliased if that alias is not used.

Anti-pattern
SELECT
    a
FROM foo AS zoo
Best practice
Use the alias or remove it. An usused alias makes code
harder to read without changing any functionality.
SELECT
    zoo.a
FROM foo AS zoo

-- Alternatively...

SELECT
    a
FROM foo
class Rule_L026(code, description, **kwargs)

References cannot reference objects not present in FROM clause.

Anti-pattern
In this example, the reference ‘vee’ has not been declared.
SELECT
    vee.a
FROM foo
Best practice
Remove the reference.
SELECT
    a
FROM foo
class Rule_L027(code, description, **kwargs)

References should be qualified if select has more than one referenced table/view.

NB: Except if they’re present in a USING clause.

Anti-pattern
In this example, the reference ‘vee’ has not been declared
and the variables ‘a’ and ‘b’ are potentially ambiguous.
SELECT a, b
FROM foo
LEFT JOIN vee ON vee.a = foo.a
Best practice
Add the references.
SELECT foo.a, vee.b
FROM foo
LEFT JOIN vee ON vee.a = foo.a
class Rule_L028(code, description, **kwargs)

References should be consistent in statements with a single table.

Configuration
single_table_references: The expectation for references in single-table select. Must be one of [‘consistent’, ‘qualified’, ‘unqualified’].

Anti-pattern
In this example, only the field b is referenced.
SELECT
    a,
    foo.b
FROM foo
Best practice
Remove all the reference or reference all the fields.
SELECT
    a,
    b
FROM foo

-- Also good

SELECT
    foo.a,
    foo.b
FROM foo
class Rule_L029(code, description, **kwargs)

Keywords should not be used as identifiers.

Configuration
only_aliases: Whether or not to flags violations for only alias expressions or all unquoted identifiers. Must be one of [True, False].

Anti-pattern
In this example, SUM function is used as an alias.
SELECT
    sum.a
FROM foo AS sum
Best practice
Avoid keywords as the name of an alias.
SELECT
    vee.a
FROM foo AS vee
class Rule_L030(code, description, **kwargs)

Inconsistent capitalisation of function names.

Configuration
capitalisation_policy: The capitalisation policy to enforce. Must be one of [‘consistent’, ‘upper’, ‘lower’, ‘capitalise’].

The functionality for this rule is inherited from Rule_L010.

Anti-pattern
In this example, the two SUM functions don’t have the same capitalisation.
SELECT
    sum(a) AS aa,
    SUM(b) AS bb
FROM foo
Best practice
Make the case consistent.
SELECT
    sum(a) AS aa,
    sum(b) AS bb
FROM foo
class Rule_L031(code, description, **kwargs)

sqlfluff fix compatible.

Avoid table aliases in from clauses and join conditions.

Anti-pattern
In this example, alias ‘o’ is used for the orders table, and ‘c’ is used for ‘customers’ table.
SELECT
    COUNT(o.customer_id) as order_amount,
    c.name
FROM orders as o
JOIN customers as c on o.id = c.user_id
Best practice
Avoid aliases.
SELECT
    COUNT(orders.customer_id) as order_amount,
    customers.name
FROM orders
JOIN customers on orders.id = customers.user_id

-- Self-join will not raise issue

SELECT
    table.a,
    table_alias.b,
FROM
    table
    LEFT JOIN table AS table_alias ON table.foreign_key = table_alias.foreign_key
class Rule_L032(code, description, **kwargs)

Prefer specifying join keys instead of using “USING”.

Anti-pattern
SELECT
    table_a.field_1,
    table_b.field_2
FROM
    table_a
INNER JOIN table_b USING (id)
Best practice
Specify the keys directly
SELECT
    table_a.field_1,
    table_b.field_2
FROM
    table_a
INNER JOIN table_b
    ON table_a.id = table_b.id
class Rule_L033(code, description, **kwargs)

UNION ALL is preferred over UNION.

Anti-pattern
In this example, UNION ALL should be preferred over UNION
SELECT a, b FROM table_1 UNION SELECT a, b FROM table_2
Best practice
Replace UNION with UNION ALL
SELECT a, b FROM table_1 UNION ALL SELECT a, b FROM table_2
class Rule_L034(code, description, **kwargs)

sqlfluff fix compatible.

Use wildcards then simple select targets before calculations and aggregates.

Anti-pattern
select
    a,
    *,
    row_number() over (partition by id order by date) as y,
    b
from x
Best practice
Order “select” targets in ascending complexity
select
    *,
    a,
    b,
    row_number() over (partition by id order by date) as y
from x

Implementation

class RuleSet(name, config_info)

Class to define a ruleset.

A rule set is instantiated on module load, but the references to each of it’s classes are instantiated at runtime. This means that configuration values can be passed to those rules live and be responsive to any changes in configuration from the path that the file is in.

Rules should be fetched using the get_rulelist() command which also handles any filtering (i.e. whitelisting and blacklisting).

New rules should be added to the instance of this class using the register() decorator. That decorator registers the class, but also performs basic type and name-convention checks.

The code for the rule will be parsed from the name, the description from the docstring. The eval function is assumed that it will be overriden by the subclass, and the parent class raises an error on this function if not overriden.

copy()

Return a copy of self with a seperate register.

document_configuration(cls)

Add a ‘Configuration’ section to a Rule docstring.

Utilize the the metadata in config_info to dynamically document the configuration options for a given rule.

This is a little hacky, but it allows us to propogate configuration options in the docs, from a single source of truth.

document_fix_compatible(cls)

Mark the rule as fixable in the documentation.

get_rulelist(config)

Use the config to return the appropriate rules.

We use the config both for whitelisting and blacklisting, but also for configuring the rules given the given config.

Returns

list of instantiated BaseCrawler.

register(cls)

Decorate a class with this to add it to the ruleset.

@myruleset.register
class Rule_L001(BaseCrawler):
    "Description of rule."

    def eval(self, **kwargs):
        return LintResult()

We expect that rules are defined as classes with the name Rule_XXXX where XXXX is of the form LNNN, where L is a letter (literally L for linting by default) and N is a three digit number.

If this receives classes by any other name, then it will raise a ValueError.

class BaseCrawler(code, description, **kwargs)

The base class for a crawler, of which all rules are derived from.

Parameters
  • code (str) – The identifier for this rule, used in inclusion or exclusion.

  • description (str) – A human readable description of what this rule does. It will be displayed when any violations are found.

_eval(**kwargs)

Evaluate this rule against the current context.

This should indicate whether a linting violation has occured and/or whether there is something to remember from this evaluation.

Note that an evaluate function shoul always accept **kwargs, but if it relies on any available kwargs, it should explicitly call them out at definition.

Returns

LintResult or None.

The reason that this method is called _eval() and not eval is a bit of a hack with sphinx autodoc, to make it so that the rule documentation auto-generates nicely.

crawl(segment, dialect, parent_stack=None, siblings_pre=None, siblings_post=None, raw_stack=None, fix=False, memory=None)

Recursively perform the crawl operation on a given segment.

Returns

A tuple of (vs, raw_stack, fixes, memory)

static filter_meta(segments, keep_meta=False)

Filter the segments to non-meta.

Or optionally the opposite if keep_meta is True.

classmethod get_parent_of(segment, root_segment)

Return the segment immediately containing segment.

NB: This is recursive.

Parameters
  • segment – The segment to look for.

  • root_segment – Some known parent of the segment we’re looking for (although likely not the direct parent in question).

classmethod make_keyword(raw, pos_marker)

Make a keyword segment.

classmethod make_newline(pos_marker, raw=None)

Make a newline segment.

classmethod make_whitespace(raw, pos_marker)

Make a whitespace segment.

class LintResult(anchor=None, fixes=None, memory=None, description=None)

A class to hold the results of a crawl operation.

Parameters
  • anchor (BaseSegment, optional) – A segment which represents the position of the a problem. NB: Each fix will also hold it’s own reference to position, so this position is mostly for alterting the user to where the problem is.

  • fixes (list of LintFix, optional) – An array of any fixes which would correct this issue. If not present then it’s assumed that this issue will have to manually fixed.

  • memory (dict, optional) – An object which stores any working memory for the crawler. The memory returned in any LintResult will be passed as an input to the next segment to be crawled.

  • description (str, optional) – A description of the problem identified as part of this result. This will override the description of the rule as what gets reported to the user with the problem if provided.

to_linting_error(rule)

Convert a linting result to a SQLLintError if appropriate.

class LintFix(edit_type, anchor, edit=None)

A class to hold a potential fix to a linting violation.

Parameters
  • edit_type (str) – One of create, edit, delete to indicate the kind of fix this represents.

  • anchor (BaseSegment) – A segment which represents the position that this fix should be applied at. For deletions it represents the segment to delete, for creations it implies the position to create at (with the existing element at this position to be moved after the edit), for an edit it implies the segment to be replaced.

  • edit (BaseSegment, optional) – For edit and create fixes, this hold the segment, or iterable of segments to create to replace at the given anchor point.

is_trivial()

Return true if the fix is trivial.

Trivial edits are: - Anything of zero length. - Any edits which result in themselves.

Removing these makes the routines which process fixes much faster.

The _eval function of each rule should take enough arguments that it can evaluate the position of the given segment in relation to it’s neighbors, and that the segment which finally “triggers” the error, should be the one that would be corrected OR if the rule relates to something that is missing, then it should flag on the segment FOLLOWING, the place that the desired element is missing.

Inline Ignoring Errors

sqlfluff features inline error ignoring. For example, the following will ignore the lack of whitespace surrounding the * operator.

a.a*a.b AS bad_1  -- noqa: L006

Multiple rules can be ignored by placing them in a comma-delimited list.

a.a *  a.b AS bad_2,  -- noqa: L007, L006

It is also possible to ignore non-rule based errors, and instead opt to ignore templating (TMP) & parsing (PRS) errors.

WHERE dt >= DATE_ADD(CURRENT_DATE(), INTERVAL -2 DAY) -- noqa: PRS

Should the need arise, not specifying specific rules to ignore will ignore all rules on the given line.

a.a*a.b AS bad_3  -- noqa