Python API Reference¶
SQLFluff exposes a public API for other Python applications to use. A basic example of this usage is given here, with the documentation for each of the methods below.
"""This is an example of how to use the simple sqlfluff api."""
from typing import Any, Dict, Iterator, List, Union
import sqlfluff
# -------- LINTING ----------
my_bad_query = "SeLEct *, 1, blah as fOO from mySchema.myTable"
# Lint the given string and return an array of violations in JSON representation.
lint_result = sqlfluff.lint(my_bad_query, dialect="bigquery")
# lint_result =
# [
# {
# "code": "CP01",
# "line_no": 1,
# "line_pos": 1,
# "description": "Keywords must be consistently upper case.",
# }
# ...
# ]
# -------- FIXING ----------
# Fix the given string and get a string back which has been fixed.
fix_result_1 = sqlfluff.fix(my_bad_query, dialect="bigquery")
# fix_result_1 = 'SELECT *, 1, blah AS foo FROM myschema.mytable\n'
# We can also fix just specific rules.
fix_result_2 = sqlfluff.fix(my_bad_query, rules=["CP01"])
# fix_result_2 = 'SELECT *, 1, blah AS fOO FROM mySchema.myTable'
# Or a subset of rules...
fix_result_3 = sqlfluff.fix(my_bad_query, rules=["CP01", "CP02"])
# fix_result_3 = 'SELECT *, 1, blah AS fOO FROM myschema.mytable'
# -------- PARSING ----------
# Parse the given string and return a JSON representation of the parsed tree.
parse_result = sqlfluff.parse(my_bad_query)
# parse_result = {'file': {'statement': {...}, 'newline': '\n'}}
# This JSON structure can then be parsed as required.
# An example usage is shown below:
def get_json_segment(
parse_result: Dict[str, Any], segment_type: str
) -> Iterator[Union[str, Dict[str, Any], List[Dict[str, Any]]]]:
"""Recursively search JSON parse result for specified segment type.
Args:
parse_result (Dict[str, Any]): JSON parse result from `sqlfluff.parse`.
segment_type (str): The segment type to search for.
Yields:
Iterator[Union[str, Dict[str, Any], List[Dict[str, Any]]]]:
Retrieves children of specified segment type as either a string for a raw
segment or as JSON or an array of JSON for non-raw segments.
"""
for k, v in parse_result.items():
if k == segment_type:
yield v
elif isinstance(v, dict):
yield from get_json_segment(v, segment_type)
elif isinstance(v, list):
for s in v:
yield from get_json_segment(s, segment_type)
# e.g. Retrieve array of JSON for table references.
table_references = list(get_json_segment(parse_result, "table_reference"))
print(table_references)
# [[{'identifier': 'mySchema'}, {'dot': '.'}, {'identifier': 'myTable'}]]
# Retrieve raw table name from last identifier in the table reference.
for table_reference in table_references:
table_name = list(get_json_segment(parse_result, "naked_identifier"))[-1]
print(f"table_name: {table_name}")
# table_name: myTable
Simple API commands¶
Sqlfluff is a SQL linter for humans.
- fix(sql: str, dialect: str = 'ansi', rules: List[str] | None = None, exclude_rules: List[str] | None = None, config: FluffConfig | None = None, config_path: str | None = None, fix_even_unparsable: bool | None = None) str ¶
Fix a SQL string.
- Parameters:
  - sql (str) – The SQL to be fixed.
  - dialect (str, optional) – A reference to the dialect of the SQL to be fixed. Defaults to ansi.
  - rules (Optional[List[str]], optional) – A subset of rule references to fix for. Defaults to None.
  - exclude_rules (Optional[List[str]], optional) – A subset of rule references to avoid fixing for. Defaults to None.
  - config (Optional[FluffConfig], optional) – A configuration object to use for the operation. Defaults to None.
  - config_path (Optional[str], optional) – A path to a .sqlfluff config, which is only used if a config is not already provided. Defaults to None.
  - fix_even_unparsable (bool, optional) – Optional override for the corresponding SQLFluff configuration value.
- Returns:
  str for the fixed SQL if possible.
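As a hedged illustration of how these keyword arguments combine, the sketch below excludes one rule and sets the fix_even_unparsable override; the SQL string and rule code are arbitrary examples.
"""A minimal sketch of sqlfluff.fix with optional arguments."""
import sqlfluff

fixed_sql = sqlfluff.fix(
    "select 1 from my_table",
    dialect="ansi",
    exclude_rules=["CP01"],  # Leave keyword capitalisation untouched.
    fix_even_unparsable=True,  # Override the config value of the same name.
)
# fixed_sql now holds the rewritten SQL string.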
- lint(sql: str, dialect: str = 'ansi', rules: List[str] | None = None, exclude_rules: List[str] | None = None, config: FluffConfig | None = None, config_path: str | None = None) List[Dict[str, Any]] ¶
Lint a SQL string.
- Parameters:
  - sql (str) – The SQL to be linted.
  - dialect (str, optional) – A reference to the dialect of the SQL to be linted. Defaults to ansi.
  - rules (Optional[List[str]], optional) – A list of rule references to lint for. Defaults to None.
  - exclude_rules (Optional[List[str]], optional) – A list of rule references to avoid linting for. Defaults to None.
  - config (Optional[FluffConfig], optional) – A configuration object to use for the operation. Defaults to None.
  - config_path (Optional[str], optional) – A path to a .sqlfluff config, which is only used if a config is not already provided. Defaults to None.
- Returns:
  List[Dict[str, Any]] for each violation found.
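Since each violation is a plain dict with the keys shown in the linting example above, the results can be filtered or reported directly. A short sketch:
"""A minimal sketch of reporting sqlfluff.lint results."""
import sqlfluff

violations = sqlfluff.lint("SeLEct 1 from tbl\n", dialect="ansi")
for violation in violations:
    # Each violation dict carries "code", "line_no", "line_pos" and "description".
    print(f'{violation["code"]} (line {violation["line_no"]}): {violation["description"]}')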
- parse(sql: str, dialect: str = 'ansi', config: FluffConfig | None = None, config_path: str | None = None) Dict[str, Any] ¶
Parse a SQL string.
- Parameters:
  - sql (str) – The SQL to be parsed.
  - dialect (str, optional) – A reference to the dialect of the SQL to be parsed. Defaults to ansi.
  - config (Optional[FluffConfig], optional) – A configuration object to use for the operation. Defaults to None.
  - config_path (Optional[str], optional) – A path to a .sqlfluff config, which is only used if a config is not already provided. Defaults to None.
- Returns:
  Dict[str, Any] JSON containing the parsed structure.
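Because the return value is a plain nested dict, it can be inspected with standard library tools; for example, pretty-printing it with json (a small sketch):
"""A minimal sketch of inspecting sqlfluff.parse output."""
import json

import sqlfluff

parse_result = sqlfluff.parse("SELECT 1\n", dialect="ansi")
# The structure is JSON-serialisable, so it can be pretty-printed or saved.
print(json.dumps(parse_result, indent=2))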
Note
In the case of multiple potential variants from the raw source file, only the first variant is returned by the simple API. For access to the other variants, use the underlying main API directly.
Advanced API usage¶
The simple API presents only a fraction of the functionality present within the core SQLFluff library. For more advanced use cases, users can import the Linter() and FluffConfig() classes from sqlfluff.core. As of version 0.4.0 this is considered experimental only, as the internals may change without warning in any future release. If you come to rely on the internals of SQLFluff, please post an issue on GitHub to share what you’re up to. This will help shape a more reliable, tidy and well documented public API for use.
Configuring SQLFluff¶
You can use the FluffConfig() class to configure SQLFluff behaviour.
"""This is an example of providing config overrides."""
from sqlfluff.core import FluffConfig, Linter
sql = "SELECT 1\n"
config = FluffConfig(
overrides={
"dialect": "snowflake",
# NOTE: We explicitly set the string "none" here rather
# than a None literal so that it overrides any config
# set by any config files in the path.
"library_path": "none",
}
)
linted_file = Linter(config=config).lint_string(sql)
assert linted_file.get_violations() == []
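Building on the snippet above, the same pattern surfaces violations when a string does not lint cleanly. A brief self-contained sketch; the SQL here is an arbitrary example.
"""An example of inspecting violations from the Linter."""
from sqlfluff.core import FluffConfig, Linter

config = FluffConfig(overrides={"dialect": "snowflake", "library_path": "none"})
linted_file = Linter(config=config).lint_string("SeLEct 1\n")
for violation in linted_file.get_violations():
    # Each entry is a violation object describing a single issue.
    print(violation)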
Instances of FluffConfig() can be created manually, or parsed from configuration files, strings or keyword arguments.
"""An example to show a few ways of configuring the API."""
import sqlfluff
from sqlfluff.core import FluffConfig, Linter
# #######################################
# The simple API can be configured in three ways.
# 1. Limited keyword arguments
sqlfluff.fix("SELECT 1", dialect="bigquery")
# 2. Providing the path to a config file
sqlfluff.fix("SELECT 1", config_path="test/fixtures/.sqlfluff")
# 3. Providing a preconfigured FluffConfig object.
# NOTE: This is the way of configuring SQLFluff which will give the most control.
# 3a. FluffConfig objects can be created directly from a dictionary of values.
config = FluffConfig(configs={"core": {"dialect": "bigquery"}})
# 3b. FluffConfig objects can be created from a config file in a string.
config = FluffConfig.from_string("[sqlfluff]\ndialect=bigquery\n")
# 3c. FluffConfig objects can be created from a config file in multiple strings
# to simulate the effect of multiple nested config strings.
config = FluffConfig.from_strings(
# NOTE: Given these two strings, the resulting dialect would be "mysql"
# as the later files take precedence.
"[sqlfluff]\ndialect=bigquery\n",
"[sqlfluff]\ndialect=mysql\n",
)
# 3d. FluffConfig objects can be created from a path containing a config file.
config = FluffConfig.from_path("test/fixtures/")
# 3e. FluffConfig objects can be created from keyword arguments.
config = FluffConfig.from_kwargs(dialect="bigquery", rules=["LT01"])
# The FluffConfig is then provided via a config argument.
sqlfluff.fix("SELECT 1", config=config)
# #######################################
# The core API is always configured using a FluffConfig object.
# When instantiating a Linter (or Parser), a FluffConfig must be provided.
# See above for details on how to create a FluffConfig.
linter = Linter(config=config)
# The provided config will then be used in any operations.
lint_result = linter.lint_string("SELECT 1", fix=True)
fixed_string = lint_result.fix_string()
# NOTE: The "True" element shows that fixing was a success.
assert fixed_string == ("SELECT 1", True)
Supported dialects and rules are available through list_dialects() and list_rules().
"""This is an example of how get basic options from sqlfluff."""
import sqlfluff
# -------- DIALECTS ----------
dialects = sqlfluff.list_dialects()
# dialects = [DialectTuple(label='ansi', name='ansi', inherits_from='nothing'), ...]
dialect_names = [dialect.label for dialect in dialects]
# dialect_names = ["ansi", "snowflake", ...]
# -------- RULES ----------
rules = sqlfluff.list_rules()
# rules = [
# RuleTuple(
# code='Example_LT01',
# description='ORDER BY on these columns is forbidden!'
# ),
# ...
# ]
rule_codes = [rule.code for rule in rules]
# rule_codes = ["LT01", "LT02", ...]
Advanced API reference¶
The core elements of sqlfluff.
- class FluffConfig(configs: ConfigMappingType | None = None, extra_config_path: str | None = None, ignore_local_config: bool = False, overrides: ConfigMappingType | None = None, plugin_manager: pluggy.PluginManager | None = None, require_dialect: bool = True)¶
The class that actually gets passed around as a config object.
- diff_to(other: FluffConfig) Dict[str, Any] ¶
Compare this config to another.
- Parameters:
  - other (FluffConfig) – Another config object to compare against. We will return keys from this object that are not in other or are different to those in other.
- Returns:
  A filtered dict of items in this config that are not in the other or are different to the other.
- classmethod from_kwargs(config: FluffConfig | None = None, dialect: str | None = None, rules: List[str] | None = None, exclude_rules: List[str] | None = None, require_dialect: bool = True) FluffConfig ¶
Instantiate a config from either an existing config or kwargs.
This is a convenience method for the ways that the public classes like Linter(), Parser() and Lexer() can be instantiated with a FluffConfig or with the convenience kwargs: dialect & rules.
- classmethod from_path(path: str, extra_config_path: str | None = None, ignore_local_config: bool = False, overrides: ConfigMappingType | None = None, plugin_manager: pluggy.PluginManager | None = None) FluffConfig ¶
Loads a config object given a particular path.
- classmethod from_root(extra_config_path: str | None = None, ignore_local_config: bool = False, overrides: Dict[str, Any] | None = None, **kw: Any) FluffConfig ¶
Loads a config object just based on the root directory.
- classmethod from_string(config_string: str, extra_config_path: str | None = None, ignore_local_config: bool = False, overrides: Dict[str, Any] | None = None, plugin_manager: PluginManager | None = None) FluffConfig ¶
Loads a config object from a single config string.
- classmethod from_strings(*config_strings: str, extra_config_path: str | None = None, ignore_local_config: bool = False, overrides: Dict[str, Any] | None = None, plugin_manager: PluginManager | None = None) FluffConfig ¶
Loads a config object given a series of nested config strings.
Config strings are incorporated from first to last, treating the first element as the “root” config, and then later config strings will take precedence over any earlier values.
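A brief sketch of that precedence, assuming the resolved dialect name can then be read back with get():
from sqlfluff.core import FluffConfig

config = FluffConfig.from_strings(
    "[sqlfluff]\ndialect=bigquery\n",
    "[sqlfluff]\ndialect=mysql\n",
)
# The later string takes precedence for the overlapping key.
print(config.get("dialect"))  # mysql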
- get(val: str, section: str | Iterable[str] = 'core', default: Any = None) Any ¶
Get a particular value from the config.
- get_section(section: str | Iterable[str]) Any ¶
Return a whole section of config as a dict.
If the element found at the address is a value and not a section, it is still returned and so this can be used as a more advanced form of the basic get method.
- Parameters:
section – An iterable or string. If it’s a string we load that root section. If it’s an iterable of strings, then we treat it as a path within the dictionary structure.
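For illustration, a small sketch contrasting get() with get_section(); it assumes the standard core section of the default configuration:
from sqlfluff.core import FluffConfig

config = FluffConfig(overrides={"dialect": "ansi"})
# Fetch a single value from the default "core" section...
print(config.get("dialect"))  # ansi
# ...or pull back the whole section as a dict and index into it.
core_section = config.get_section("core")
print(core_section["dialect"])  # ansi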
- get_templater(templater_name: str = 'jinja', **kwargs: Any) RawTemplater ¶
Fetch a templater by name.
- iter_vals(cfg: Dict[str, Any] | None = None) Iterable[Tuple[Any, ...]] ¶
Return an iterable of tuples representing keys.
We show values before dicts; the tuple contains an indent value to know what level of the dict we’re in. Dict labels will be returned as a blank value before their content.
- make_child_from_path(path: str) FluffConfig ¶
Make a child config at a path but pass on overrides and extra_config_path.
- process_inline_config(config_line: str, fname: str) None ¶
Process an inline config command and update self.
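A hedged sketch, assuming the standard `-- sqlfluff:key:value` inline comment form and that the updated value is then readable via get():
from sqlfluff.core import FluffConfig

config = FluffConfig(overrides={"dialect": "ansi"})
# Inline config comments take the form "-- sqlfluff:<key>:<value>".
config.process_inline_config("-- sqlfluff:dialect:bigquery", fname="<string>")
print(config.get("dialect"))  # bigquery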
- process_raw_file_for_config(raw_str: str, fname: str) None ¶
Process a full raw file for inline config and update self.
- set_value(config_path: Iterable[str], val: Any) None ¶
Set a value at a given path.
- verify_dialect_specified() None ¶
Check if the config specifies a dialect, raising an error if not.
- class Lexer(config: FluffConfig | None = None, last_resort_lexer: StringLexer | None = None, dialect: str | None = None)¶
The Lexer class actually does the lexing step.
- elements_to_segments(elements: List[TemplateElement], templated_file: TemplatedFile) Tuple[RawSegment, ...] ¶
Convert a tuple of lexed elements into a tuple of segments.
- lex(raw: str | TemplatedFile) Tuple[Tuple[BaseSegment, ...], List[SQLLexError]] ¶
Take a string or TemplatedFile and return segments.
If we fail to match the whole string, then we must have found something that we cannot lex. If that happens we should package it up as unlexable and keep track of the exceptions.
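For illustration, a minimal sketch of lexing a string directly and inspecting the result:
from sqlfluff.core import Lexer

tokens, lex_violations = Lexer(dialect="ansi").lex("SELECT 1\n")
# tokens is a tuple of raw segments; lex_violations lists anything unlexable.
print([token.raw for token in tokens])  # e.g. ['SELECT', ' ', '1', '\n', ...]
print(lex_violations)  # []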
- static lex_match(forward_string: str, lexer_matchers: List[StringLexer]) LexMatch ¶
Iteratively match strings using the selection of submatchers.
- static map_template_slices(elements: List[LexedElement], template: TemplatedFile) List[TemplateElement] ¶
Create a tuple of TemplateElement from a tuple of LexedElement.
This adds slices in the templated file to the original lexed elements. We’ll need this to work out the position in the source file.
- static violations_from_segments(segments: Tuple[RawSegment, ...]) List[SQLLexError] ¶
Generate any lexing errors for any unlexables.
- class Linter(config: FluffConfig | None = None, formatter: Any = None, dialect: str | None = None, rules: List[str] | None = None, user_rules: List[Type[BaseRule]] | None = None, exclude_rules: List[str] | None = None)¶
The interface class to interact with the linter.
- classmethod allowed_rule_ref_map(reference_map: Dict[str, Set[str]], disable_noqa_except: str | None) Dict[str, Set[str]] ¶
Generate a noqa rule reference map.
- fix(tree: BaseSegment, config: FluffConfig | None = None, fname: str | None = None, templated_file: TemplatedFile | None = None) Tuple[BaseSegment, List[SQLBaseError]] ¶
Return the fixed tree and violations from lintfix when we’re fixing.
- get_rulepack(config: FluffConfig | None = None) RulePack ¶
Get hold of a set of rules.
- lint(tree: BaseSegment, config: FluffConfig | None = None, fname: str | None = None, templated_file: TemplatedFile | None = None) List[SQLBaseError] ¶
Return just the violations from lintfix when we’re only linting.
- classmethod lint_fix_parsed(tree: BaseSegment, config: FluffConfig, rule_pack: RulePack, fix: bool = False, fname: str | None = None, templated_file: TemplatedFile | None = None, formatter: Any = None) Tuple[BaseSegment, List[SQLBaseError], IgnoreMask | None, List[Tuple[str, str, float]]] ¶
Lint and optionally fix a tree object.
- classmethod lint_parsed(parsed: ParsedString, rule_pack: RulePack, fix: bool = False, formatter: Any = None, encoding: str = 'utf8') LintedFile ¶
Lint a ParsedString and return a LintedFile.
- lint_path(path: str, fix: bool = False, ignore_non_existent_files: bool = False, ignore_files: bool = True, processes: int | None = None) LintedDir ¶
Lint a path.
- lint_paths(paths: Tuple[str, ...], fix: bool = False, ignore_non_existent_files: bool = False, ignore_files: bool = True, processes: int | None = None, apply_fixes: bool = False, fixed_file_suffix: str = '', fix_even_unparsable: bool = False, retain_files: bool = True) LintingResult ¶
Lint an iterable of paths.
- classmethod lint_rendered(rendered: RenderedFile, rule_pack: RulePack, fix: bool = False, formatter: Any = None) LintedFile ¶
Take a RenderedFile and return a LintedFile.
- lint_string(in_str: str = '', fname: str = '<string input>', fix: bool = False, config: FluffConfig | None = None, encoding: str = 'utf8') LintedFile ¶
Lint a string.
- Returns:
an object representing that linted file.
- Return type:
LintedFile
- lint_string_wrapped(string: str, fname: str = '<string input>', fix: bool = False) LintingResult ¶
Lint strings directly.
- static load_raw_file_and_config(fname: str, root_config: FluffConfig) Tuple[str, FluffConfig, str] ¶
Load a raw file and the associated config.
- parse_path(path: str, parse_statistics: bool = False) Iterator[ParsedString] ¶
Parse a path of sql files.
NB: This is a generator which will yield the result of each file within the path iteratively.
- classmethod parse_rendered(rendered: RenderedFile, parse_statistics: bool = False) ParsedString ¶
Parse a rendered file.
- parse_string(in_str: str, fname: str = '<string>', config: FluffConfig | None = None, encoding: str = 'utf-8', parse_statistics: bool = False) ParsedString ¶
Parse a string.
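A small sketch of parsing a string through the Linter; it assumes the returned ParsedString exposes the parsed tree as a `.tree` attribute:
from sqlfluff.core import Linter

parsed = Linter(dialect="ansi").parse_string("SELECT 1\n")
# NOTE: `.tree` is assumed here to hold the root segment of the parse tree.
if parsed.tree:
    print(parsed.tree.stringify())  # A readable dump of the parse tree.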
- static remove_templated_errors(linting_errors: List[SQLBaseError]) List[SQLBaseError] ¶
Filter a list of lint errors, removing those from the templated slices.
- render_file(fname: str, root_config: FluffConfig) RenderedFile ¶
Load and render a file with relevant config.
- render_string(in_str: str, fname: str, config: FluffConfig, encoding: str) RenderedFile ¶
Template the file.
- rule_tuples() List[RuleTuple] ¶
A simple pass through to access the rule tuples of the rule set.
- class Parser(config: FluffConfig | None = None, dialect: str | None = None)¶
Instantiates parsed queries from a sequence of lexed raw segments.
- parse(segments: Sequence[BaseSegment], fname: str | None = None, parse_statistics: bool = False) BaseSegment | None ¶
Parse a series of lexed tokens using the current dialect.
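Putting the low-level pieces together, a hedged sketch of the lex-then-parse chain that the higher-level Linter wraps:
from sqlfluff.core import Lexer, Parser

tokens, _ = Lexer(dialect="ansi").lex("SELECT 1\n")
parsed = Parser(dialect="ansi").parse(tokens)
if parsed:
    # The root segment reconstructs the original source via .raw.
    print(parsed.raw)  # SELECT 1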