Production Usage & Security¶
SQLFluff is designed to be used both as a utility for developers but also to be part of CI/CD pipelines.
Security Considerations¶
A full list of Security Advisories is available on GitHub.
Given the context of how SQLFluff is designed to be used, there are three different tiers of access which users may have access to manipulate how the tool functions in a secure environment.
Users may have edit access to the SQL code which is being linted. While SQLFluff does not execute the SQL itself, in the process of the templating step (in particular via jinja or dbt), certain macros may have the ability to execute arbitrary SQL code (e.g. the dbt run_query macro). For the Jinja templater, SQLFluff uses the Jinja2 SandboxedEnvironment to limit the execution on unsafe code. When looking to further secure this situation, see below for ways to limit the ability of users to import other libraries.
Users may have edit access to the SQLFluff :ref:`config-files`. In some (perhaps, many) environments, the users who can edit SQL files may also be able to access and edit the Configuration Files. It’s important to note that because of In-File Configuration Directives, that users who can edit SQL files which are designed to be linted, will also have access to the vast majority of any configuration options available in Configuration Files. This means that there is minimal additional protection from restricting access to Configuration Files for users who already have access to edit the linting target files (as described above).
Users may have access to change how SQLFluff is invoked. SQLFluff can be invoked either as a command line too or via the python API. Typically the method is fixed for a given application. When thinking about how to restrict the ability of users to call insecure code, SQLFluff aims to provide options at the point of invocation. In particular, as described above, the primary risk vector for SQLFluff is the macro environment as described in Templating Configuration. To restrict users being able to bring arbitrary python methods into sqlfluff via the
library_path
configuration value (see Library Templating), we recommend that for secure environments you override this config value either by providing anoverride
option to theFluffConfig
object if using the Python API or via the--library-path
CLI option:To disable this option entirely via the CLI:
$ sqlfluff lint my_path --library-path none
To disable this option entirely via the python API:
"""This is an example of providing config overrides.""" from sqlfluff.core import FluffConfig, Linter sql = "SELECT 1\n" config = FluffConfig( overrides={ "dialect": "snowflake", # NOTE: We explicitly set the string "none" here rather # than a None literal so that it overrides any config # set by any config files in the path. "library_path": "none", } ) linted_file = Linter(config=config).lint_string(sql) assert linted_file.get_violations() == []
Using SQLFluff on a whole sql codebase¶
The exit code provided by SQLFluff when run as a command line utility is designed to assist usefulness in deployment pipelines. If no violations are found then the exit code will be 0. If violations are found then a non-zero code will be returned which can be interrogated to find out more.
An error code of
0
means operation success, no issues found.An error code of
1
means operation success, issues found. For example this might mean that a linting issue was found, or that one file could not be parsed.An error code of
2
means an error occurred and the operation could not be completed. For example a configuration issue or an internal error within SQLFluff.
Using SQLFluff on changes using diff-quality
¶
For projects with large amounts of (potentially imperfect) SQL code, the full SQLFluff output could be very large, which can be distracting – perhaps the CI build for a one-line SQL change shouldn’t encourage the developer to fix lots of unrelated quality issues.
To support this use case, SQLFluff integrates with a quality checking tool
called diff-quality
. By running SQLFluff using diff-quality
(rather
than running it directly), you can limit the the output to the new or modified
SQL in the branch (aka pull request or PR) containing the proposed changes.
Currently, diff-quality
requires that you are using git
for version
control.
NOTE: Installing SQLFluff automatically installs the diff_cover
package
that provides the diff-quality
tool.
Adding diff-quality
to your builds¶
In your CI build script:
1. Set the current working directory to the git
repository containing the
SQL code to be checked.
Run
diff-quality
, specifying SQLFluff as the underlying tool:
$ diff-quality --violations sqlfluff
The output will look something like:
-------------
Diff Quality
Quality Report: sqlfluff
Diff: origin/master...HEAD, staged and unstaged changes
-------------
sql/audience_size_queries/constraints/_postcondition_check_gdpr_compliance.sql (0.0%):
sql/audience_size_queries/constraints/_postcondition_check_gdpr_compliance.sql:5: Unquoted Identifiers must be consistently upper case.
-------------
Total: 1 line
Violations: 1 line
% Quality: 0%
-------------
These messages are basically the same as those provided directly by SQLFluff,
although the format is a little different. Note that diff-quality
only lists
the line _numbers_, not the character position. If you need the character
position, you will need to run SQLFluff directly.
For more information on diff-quality
, see the
documentation. It covers topics
such as:
Generating HTML reports
Controlling which branch to compare against (i.e. to determine new/changed lines). The default is origin/master.
Configuring
diff-quality
to return an error code if the quality is too low.Troubleshooting
Using pre-commit¶
pre-commit is a framework to manage git “hooks” triggered right before a commit is made.
A git hook is a git feature to “fire off custom scripts” when specific actions occur.
Using pre-commit with SQLFluff is a good way to provide automated linting to SQL developers.
With pre-commit, you also get the benefit of only linting/fixing the files that changed.
SQLFluff comes with two pre-commit hooks:
sqlfluff-lint: returns linting errors.
sqlfluff-fix: attempts to fix rule violations.
Warning
For safety reasons, sqlfluff-fix
by default will not make any fixes in
files that had templating or parse errors, even if those errors were ignored
using noqa
or –ignore`.
Although it is not advised, you can tell SQLFluff to try and fix
these files by overriding the fix_even_unparsable
setting
in .sqlfluff
config file or using the sqlfluff fix --FIX-EVEN-UNPARSABLE
command line option.
Overriding this behavior may break your SQL. If you use this override, always be sure to review any fixes applied to files with templating or parse errors to verify they are okay.
You should create a file named .pre-commit-config.yaml at the root of your git project, which should look like this:
repos:
- repo: https://github.com/sqlfluff/sqlfluff
rev: stable_version
hooks:
- id: sqlfluff-lint
# For dbt projects, this installs the dbt "extras".
# You will need to select the relevant dbt adapter for your dialect
# (https://docs.getdbt.com/docs/available-adapters):
# additional_dependencies: ['<dbt-adapter>', 'sqlfluff-templater-dbt']
- id: sqlfluff-fix
# Arbitrary arguments to show an example
# args: [--rules, "LT02,CP02"]
# additional_dependencies: ['<dbt-adapter>', 'sqlfluff-templater-dbt']
When trying to use the dbt templater, uncomment the
additional_dependencies
to install the extras.
This is equivalent to running pip install <dbt-adapter> sqlfluff-templater-dbt
.
You can specify the version of dbt-adapter
used in pre-commit,
for example:
additional_dependencies : ['dbt-bigquery==1.0.0', 'sqlfluff-templater-dbt']
See the list of available dbt-adapters.
Note that you can pass the same arguments available
through the CLI using args:
.
Using GitHub Actions to Annotate PRs¶
There are two way to utilize SQLFluff to annotate Github PRs.
When
sqlfluff lint
is run with the--format github-annotation-native
option, it produces output formatted as Github workflow commands which are converted into pull request annotations by Github.When
sqlfluff lint
is run with the--format github-annotation
option, it produces output compatible with this action from yuzutech. Which uses Github API to annotate the SQL in GitHub pull requests.
Warning
At present (December 2023), limitations put in place by Github mean that only the
first 10 annotations will be displayed if the first option (using
github-annotation-native
) is used. This is a not something that SQLFluff
can control itself and so we currently recommend using the the second option
above and the action from yuzutech.
There is an open feature request for GitHub Actions which you can track to follow this issue.
For more information and examples on using SQLFluff in GitHub Actions, see the sqlfluff-github-actions repository.