From dce79f5c8e1bb26659dfd89654ddc309d8bfa9e8 Mon Sep 17 00:00:00 2001
From: "W. J. van der Laan"
Date: Wed, 5 May 2021 16:59:45 +0200
Subject: [PATCH] Merge bitcoin/bitcoin#21740: test: add new python linter to check file names and permissions

46b025e00df40724175735eb5606ac73067cb3b8 test: add new python linter to check file names and permissions (windsok)
6f6bb3ebc7cb8e17a5dfc8ef55aa2d3f2dc6bdea test: fix file permissions on various scripts (windsok)

Pull request description:

  Adds a new python linter test which tests for correct filenames and file permissions in the repository.

  Replaces the existing tests in the `test/lint/lint-filenames.sh` and `test/lint/lint-shebang.sh` linter tests, as well as adding some new and increased testing. This increased coverage is intended to catch issues such as in #21728 and https://github.com/bitcoin/bitcoin/pull/16807/files#r345547050

  Summary of tests:

  * Checks every file in the repository against an allowed regexp to make sure only lowercase or uppercase alphanumerics (a-zA-Z0-9), underscores (_), hyphens (-), at (@) and dots (.) are used in repository filenames.

  * Checks only source files (*.cpp, *.h, *.py, *.sh) against a stricter allowed regexp to make sure only lowercase alphanumerics (a-z0-9), underscores (_), hyphens (-) and dots (.) are used in source code filenames. Additionally there is an exception regexp for directories or files which are excepted from matching this regexp (This should replicate the existing `test/lint/lint-filenames.sh` test)

  * Checks all files in the repository match an allowed executable or non-executable file permission octal. Additionally checks that for executable files, the file contains a shebang line.

  * Checks that for executable `.py` and `.sh` files, the shebang line used matches an allowable list of shebangs (This should replicate the existing `test/lint/lint-shebang.sh` test)

  * Checks every file that contains a shebang line to ensure it has an executable permission

  Additionally updates the permissions on various files to comply with the new tests.

  Fixes #21729

ACKs for top commit:
  practicalswift:
    cr re-ACK 46b025e00df40724175735eb5606ac73067cb3b8: patch still looks correct
  kiminuo:
    code review ACK 46b025e00df40724175735eb5606ac73067cb3b8 if `contrib/gitian-descriptors/assign_DISTNAME` permission change is deemed OK.
  laanwj:
    Code review ACK 46b025e00df40724175735eb5606ac73067cb3b8

Tree-SHA512: 1c8201a2cee0d9cbce15652b68cec9a6458a8b493fcd5392f98560aca0b1a12e668baab65a47100f116f626dadc3f591deb47f7368468c6a46c6c712c2533455
---
 ci/test/00_setup_env_native_asan.sh           |   0
 .../00_setup_env_native_fuzz_with_valgrind.sh |   0
 ci/test/00_setup_env_native_multiprocess.sh   |   0
 ci/test/00_setup_env_native_valgrind.sh       |   0
 ci/test/00_setup_env_s390x.sh                 |   0
 contrib/guix/libexec/build.sh                 |   0
 contrib/qos/tc.sh                             |   0
 test/README.md                                |   2 +-
 test/lint/lint-filenames.sh                   |  24 --
 test/lint/lint-files.py                       | 207 ++++++++++++++++++
 test/lint/lint-files.sh                       |   7 +
 test/lint/lint-shebang.sh                     |  24 --
 12 files changed, 215 insertions(+), 49 deletions(-)
 mode change 100644 => 100755 ci/test/00_setup_env_native_asan.sh
 mode change 100644 => 100755 ci/test/00_setup_env_native_fuzz_with_valgrind.sh
 mode change 100644 => 100755 ci/test/00_setup_env_native_multiprocess.sh
 mode change 100644 => 100755 ci/test/00_setup_env_native_valgrind.sh
 mode change 100644 => 100755 ci/test/00_setup_env_s390x.sh
 mode change 100644 => 100755 contrib/guix/libexec/build.sh
 mode change 100644 => 100755 contrib/qos/tc.sh
 delete mode 100755 test/lint/lint-filenames.sh
 create mode 100755 test/lint/lint-files.py
 create mode 100755 test/lint/lint-files.sh
 delete mode 100755 test/lint/lint-shebang.sh

diff --git a/ci/test/00_setup_env_native_asan.sh b/ci/test/00_setup_env_native_asan.sh
old mode 100644
new mode 100755
diff --git a/ci/test/00_setup_env_native_fuzz_with_valgrind.sh b/ci/test/00_setup_env_native_fuzz_with_valgrind.sh
old mode 100644
new mode 100755
diff --git a/ci/test/00_setup_env_native_multiprocess.sh b/ci/test/00_setup_env_native_multiprocess.sh
old mode 100644
new mode 100755
diff --git a/ci/test/00_setup_env_native_valgrind.sh b/ci/test/00_setup_env_native_valgrind.sh
old mode 100644
new mode 100755
diff --git a/ci/test/00_setup_env_s390x.sh b/ci/test/00_setup_env_s390x.sh
old mode 100644
new mode 100755
diff --git a/contrib/guix/libexec/build.sh b/contrib/guix/libexec/build.sh
old mode 100644
new mode 100755
diff --git a/contrib/qos/tc.sh b/contrib/qos/tc.sh
old mode 100644
new mode 100755
diff --git a/test/README.md b/test/README.md
index c2ec80f07d..f2b0c34141 100644
--- a/test/README.md
+++ b/test/README.md
@@ -319,7 +319,7 @@ Please be aware that on Linux distributions all dependencies are usually availab
 Individual tests can be run by directly calling the test script, e.g.:
 
 ```
-test/lint/lint-filenames.sh
+test/lint/lint-files.sh
 ```
 
 You can run all the shell-based lint tests by running:
diff --git a/test/lint/lint-filenames.sh b/test/lint/lint-filenames.sh
deleted file mode 100755
index fdbf7eab39..0000000000
--- a/test/lint/lint-filenames.sh
+++ /dev/null
@@ -1,24 +0,0 @@
-#!/usr/bin/env bash
-#
-# Copyright (c) 2018-2019 The Bitcoin Core developers
-# Distributed under the MIT software license, see the accompanying
-# file COPYING or http://www.opensource.org/licenses/mit-license.php.
-#
-# Make sure only lowercase alphanumerics (a-z0-9), underscores (_),
-# hyphens (-) and dots (.) are used in source code filenames.
-
-export LC_ALL=C
-
-EXIT_CODE=0
-OUTPUT=$(git ls-files --full-name -- "*.[cC][pP][pP]" "*.[hH]" "*.[pP][yY]" "*.[sS][hH]" | \
-    grep -vE '^[a-z0-9_./-]+$' | \
-    grep -vE '^src/(dashbls/|immer/|secp256k1/|univalue/|test/fuzz/FuzzedDataProvider.h)')
-
-if [[ ${OUTPUT} != "" ]]; then
-    echo "Use only lowercase alphanumerics (a-z0-9), underscores (_), hyphens (-) and dots (.)"
-    echo "in source code filenames:"
-    echo
-    echo "${OUTPUT}"
-    EXIT_CODE=1
-fi
-exit ${EXIT_CODE}
diff --git a/test/lint/lint-files.py b/test/lint/lint-files.py
new file mode 100755
index 0000000000..43ca882442
--- /dev/null
+++ b/test/lint/lint-files.py
@@ -0,0 +1,207 @@
+#!/usr/bin/env python3
+# Copyright (c) 2021 The Bitcoin Core developers
+# Distributed under the MIT software license, see the accompanying
+# file COPYING or http://www.opensource.org/licenses/mit-license.php.
+
+"""
+This checks that all files in the repository have correct filenames and permissions
+"""
+
+import os
+import re
+import sys
+from subprocess import check_output
+from typing import Optional, NoReturn
+
+CMD_ALL_FILES = "git ls-files --full-name"
+CMD_SOURCE_FILES = 'git ls-files --full-name -- "*.[cC][pP][pP]" "*.[hH]" "*.[pP][yY]" "*.[sS][hH]"'
+CMD_SHEBANG_FILES = "git grep --full-name --line-number -I '^#!'"
+ALLOWED_FILENAME_REGEXP = "^[a-zA-Z0-9/_.@][a-zA-Z0-9/_.@-]*$"
+ALLOWED_SOURCE_FILENAME_REGEXP = "^[a-z0-9_./-]+$"
+ALLOWED_SOURCE_FILENAME_EXCEPTION_REGEXP = (
+    "^src/(secp256k1/|univalue/|test/fuzz/FuzzedDataProvider.h)"
+)
+ALLOWED_PERMISSION_NON_EXECUTABLES = 644
+ALLOWED_PERMISSION_EXECUTABLES = 755
+ALLOWED_EXECUTABLE_SHEBANG = {
+    "py": [b"#!/usr/bin/env python3"],
+    "sh": [b"#!/usr/bin/env bash", b"#!/bin/sh"],
+}
+
+
+class FileMeta(object):
+    def __init__(self, file_path: str):
+        self.file_path = file_path
+
+    @property
+    def extension(self) -> Optional[str]:
+        """
+        Returns the file extension for a given filename string.
+        eg:
+        'ci/lint_run_all.sh' -> 'sh'
+        'ci/retry/retry' -> None
+        'contrib/devtools/split-debug.sh.in' -> 'in'
+        """
+        return str(os.path.splitext(self.file_path)[1].strip(".") or None)
+
+    @property
+    def full_extension(self) -> Optional[str]:
+        """
+        Returns the full file extension for a given filename string.
+        eg:
+        'ci/lint_run_all.sh' -> 'sh'
+        'ci/retry/retry' -> None
+        'contrib/devtools/split-debug.sh.in' -> 'sh.in'
+        """
+        filename_parts = self.file_path.split(os.extsep, 1)
+        try:
+            return filename_parts[1]
+        except IndexError:
+            return None
+
+    @property
+    def permissions(self) -> int:
+        """
+        Returns the octal file permission of the file
+        """
+        return int(oct(os.stat(self.file_path).st_mode)[-3:])
+
+
+def check_all_filenames() -> int:
+    """
+    Checks every file in the repository against an allowed regexp to make sure only lowercase or uppercase
+    alphanumerics (a-zA-Z0-9), underscores (_), hyphens (-), at (@) and dots (.) are used in repository filenames.
+    """
+    # We avoid using rstrip() to ensure we catch filenames which accidentally include trailing whitespace
+    filenames = check_output(CMD_ALL_FILES, shell=True).decode("utf8").split("\n")
+    filenames = [filename for filename in filenames if filename != ""]  # removes the trailing empty list element
+
+    filename_regex = re.compile(ALLOWED_FILENAME_REGEXP)
+    failed_tests = 0
+    for filename in filenames:
+        if not filename_regex.match(filename):
+            print(
+                f"""File "{filename}" does not not match the allowed filename regexp ('{ALLOWED_FILENAME_REGEXP}')."""
+            )
+            failed_tests += 1
+    return failed_tests
+
+
+def check_source_filenames() -> int:
+    """
+    Checks only source files (*.cpp, *.h, *.py, *.sh) against a stricter allowed regexp to make sure only lowercase
+    alphanumerics (a-z0-9), underscores (_), hyphens (-) and dots (.) are used in source code filenames.
+
+    Additionally there is an exception regexp for directories or files which are excepted from matching this regexp.
+    """
+    # We avoid using rstrip() to ensure we catch filenames which accidentally include trailing whitespace
+    filenames = check_output(CMD_SOURCE_FILES, shell=True).decode("utf8").split("\n")
+    filenames = [filename for filename in filenames if filename != ""]  # removes the trailing empty list element
+
+    filename_regex = re.compile(ALLOWED_SOURCE_FILENAME_REGEXP)
+    filename_exception_regex = re.compile(ALLOWED_SOURCE_FILENAME_EXCEPTION_REGEXP)
+    failed_tests = 0
+    for filename in filenames:
+        if not filename_regex.match(filename) and not filename_exception_regex.match(filename):
+            print(
+                f"""File "{filename}" does not not match the allowed source filename regexp ('{ALLOWED_SOURCE_FILENAME_REGEXP}'), or the exception regexp ({ALLOWED_SOURCE_FILENAME_EXCEPTION_REGEXP})."""
+            )
+            failed_tests += 1
+    return failed_tests
+
+
+def check_all_file_permissions() -> int:
+    """
+    Checks all files in the repository match an allowed executable or non-executable file permission octal.
+
+    Additionally checks that for executable files, the file contains a shebang line
+    """
+    filenames = check_output(CMD_ALL_FILES, shell=True).decode("utf8").strip().split("\n")
+    failed_tests = 0
+    for filename in filenames:
+        file_meta = FileMeta(filename)
+        if file_meta.permissions == ALLOWED_PERMISSION_EXECUTABLES:
+            shebang = open(filename, "rb").readline().rstrip(b"\n")
+
+            # For any file with executable permissions the first line must contain a shebang
+            if shebang[:2] != b"#!":
+                print(
+                    f"""File "{filename}" has permission {ALLOWED_PERMISSION_EXECUTABLES} (executable) and is thus expected to contain a shebang '#!'. Add shebang or do "chmod {ALLOWED_PERMISSION_NON_EXECUTABLES} {filename}" to make it non-executable."""
+                )
+                failed_tests += 1
+
+            # For certain file extensions that have been defined, we also check that the shebang conforms to a specific
+            # allowable set of shebangs
+            if file_meta.extension in ALLOWED_EXECUTABLE_SHEBANG.keys():
+                if shebang not in ALLOWED_EXECUTABLE_SHEBANG[file_meta.extension]:
+                    print(
+                        f"""File "{filename}" is missing expected shebang """
+                        + " or ".join(
+                            [
+                                x.decode("utf-8")
+                                for x in ALLOWED_EXECUTABLE_SHEBANG[file_meta.extension]
+                            ]
+                        )
+                    )
+                    failed_tests += 1
+
+        elif file_meta.permissions == ALLOWED_PERMISSION_NON_EXECUTABLES:
+            continue
+        else:
+            print(
+                f"""File "{filename}" has unexpected permission {file_meta.permissions}. Do "chmod {ALLOWED_PERMISSION_NON_EXECUTABLES} {filename}" (if non-executable) or "chmod {ALLOWED_PERMISSION_EXECUTABLES} {filename}" (if executable)."""
+            )
+            failed_tests += 1
+
+    return failed_tests
+
+
+def check_shebang_file_permissions() -> int:
+    """
+    Checks every file that contains a shebang line to ensure it has an executable permission
+    """
+    filenames = check_output(CMD_SHEBANG_FILES, shell=True).decode("utf8").strip().split("\n")
+
+    # The git grep command we use returns files which contain a shebang on any line within the file
+    # so we need to filter the list to only files with the shebang on the first line
+    filenames = [filename.split(":1:")[0] for filename in filenames if ":1:" in filename]
+
+    failed_tests = 0
+    for filename in filenames:
+        file_meta = FileMeta(filename)
+        if file_meta.permissions != ALLOWED_PERMISSION_EXECUTABLES:
+            # These file types are typically expected to be sourced and not executed directly
+            if file_meta.full_extension in ["bash", "init", "openrc", "sh.in"]:
+                continue
+
+            # *.py files which don't contain an `if __name__ == '__main__'` are not expected to be executed directly
+            if file_meta.extension == "py":
+                file_data = open(filename, "r", encoding="utf8").read()
+                if not re.search("""if __name__ == ['"]__main__['"]:""", file_data):
+                    continue
+
+            print(
+                f"""File "{filename}" contains a shebang line, but has the file permission {file_meta.permissions} instead of the expected executable permission {ALLOWED_PERMISSION_EXECUTABLES}. Do "chmod {ALLOWED_PERMISSION_EXECUTABLES} {filename}" (or remove the shebang line)."""
+            )
+            failed_tests += 1
+    return failed_tests
+
+
+def main() -> NoReturn:
+    failed_tests = 0
+    failed_tests += check_all_filenames()
+    failed_tests += check_source_filenames()
+    failed_tests += check_all_file_permissions()
+    failed_tests += check_shebang_file_permissions()
+
+    if failed_tests:
+        print(
+            f"ERROR: There were {failed_tests} failed tests in the lint-files.py lint test. Please resolve the above errors."
+        )
+        sys.exit(1)
+    else:
+        sys.exit(0)
+
+
+if __name__ == "__main__":
+    main()
diff --git a/test/lint/lint-files.sh b/test/lint/lint-files.sh
new file mode 100755
index 0000000000..1e115778bd
--- /dev/null
+++ b/test/lint/lint-files.sh
@@ -0,0 +1,7 @@
+#!/usr/bin/env bash
+
+export LC_ALL=C
+
+set -e
+cd "$(dirname $0)/../.."
+test/lint/lint-files.py
diff --git a/test/lint/lint-shebang.sh b/test/lint/lint-shebang.sh
deleted file mode 100755
index 26843a50e1..0000000000
--- a/test/lint/lint-shebang.sh
+++ /dev/null
@@ -1,24 +0,0 @@
-#!/usr/bin/env bash
-# Copyright (c) 2018-2020 The Bitcoin Core developers
-# Distributed under the MIT software license, see the accompanying
-# file COPYING or http://www.opensource.org/licenses/mit-license.php.
-
-# Assert expected shebang lines
-
-export LC_ALL=C
-EXIT_CODE=0
-for PYTHON_FILE in $(git ls-files -- "*.py" | grep -vE "^src/(dashbls|immer)/"); do
-    if [[ $(head -c 2 "${PYTHON_FILE}") == "#!" &&
-          $(head -n 1 "${PYTHON_FILE}") != "#!/usr/bin/env python3" ]]; then
-        echo "Missing shebang \"#!/usr/bin/env python3\" in ${PYTHON_FILE} (do not use python or python2)"
-        EXIT_CODE=1
-    fi
-done
-for SHELL_FILE in $(git ls-files -- "*.sh" | grep -vE "^src/(dashbls|immer)/"); do
-    if [[ $(head -n 1 "${SHELL_FILE}") != "#!/usr/bin/env bash" &&
-          $(head -n 1 "${SHELL_FILE}") != "#!/bin/sh" ]]; then
-        echo "Missing expected shebang \"#!/usr/bin/env bash\" or \"#!/bin/sh\" in ${SHELL_FILE}"
-        EXIT_CODE=1
-    fi
-done
-exit ${EXIT_CODE}