clang-tidy customized checker example

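Preface

This article walks through a concrete example of a custom clang-tidy checker and explains the background and motivation behind it, so that readers can use it as a template for quickly developing their own clang-tidy checkers.
It does not cover basic clang-tidy usage; readers unfamiliar with the tool can consult the official documentation.
It can also be read as a supplement to "write our own checkers".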


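The Problem

Background

In recent work, our C++ developers ran into a problem: under certain build configurations, common code resolved to an unexpected function overload, which introduced a runtime error that was hard to track down.

Specifically, consider the following demo code: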

// demo.cpp
#include <iostream>
#include <string>

#ifdef USE_WIDE_STRINGS
typedef std::wstring MyString;
#define MY_TEXT(str) L##str
#else
typedef std::string MyString;
#define MY_TEXT(str) str
#endif

class ThirdPartyClass {
public:
  ThirdPartyClass(bool value) {
    std::cout << "Bool constructor called with value: " << value << std::endl;
  }

  ThirdPartyClass(const char *value) {
    std::cout << "Const char* constructor called with value: " << value
              << std::endl;
  }

  // should have a ctor with const wchar_t* parameter but not
  // ThirdPartyClass(const wchar_t *value) {
  //   std::cout << "const wchar_t* constructor called with value: " << value << std::endl;
  // }
};

void thirdPartyFunction(bool value) {
  std::cout << "Bool function called with value: " << value << std::endl;
}

void thirdPartyFunction(const char *value) {
  std::cout << "Const char* function called with value: " << value << std::endl;
}

// should have a function with const wchar_t* parameter but not
// void thirdPartyFunction(const wchar_t *value) {
//   std::cout << "const wchar_t* function called with value: " << value << std::endl;
// }

int main() {
  ThirdPartyClass instance1(true);
  ThirdPartyClass instance2("Hello, World");

  MyString str = MY_TEXT("Hello, World");
  ThirdPartyClass instance3(str.c_str()); // call ThirdPartyClass(bool value) when use wide strings

  thirdPartyFunction(true);
  thirdPartyFunction("Hello, World");
  thirdPartyFunction(str.c_str()); // call thirdPartyFunction(bool value) when use wide strings

  return 0;
}

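Analysis

The code above uses a class from a third-party library, ThirdPartyClass, which has several constructors.
Likewise, it uses a function from a third-party library, thirdPartyFunction, which has several overloads.

In addition, we define a type MyString that is an alias for either std::string or std::wstring, depending on the build configuration.

We expect that when str.c_str() of a MyString is passed as the argument, the ThirdPartyClass constructor and thirdPartyFunction always resolve to the overload taking const char*.
If a const wchar_t* is passed instead, the build should fail, explicitly telling the developer that the string needs an encoding conversion first.

Unfortunately, when MyString is std::wstring and c_str() is passed, both the ThirdPartyClass constructor and thirdPartyFunction resolve to the overload taking bool. The reason is simple: any pointer type converts implicitly to bool.
With no overload matching const wchar_t*, the compiler selects the bool overload, which later causes a runtime error (the wrong object is constructed or the wrong function is called). Because the compiler usually treats this conversion as legitimate, it emits no warning either.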
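
The overload selection described above can be reproduced without any third-party code. The following is a minimal sketch rather than the library's code: Picked and pick are hypothetical stand-ins for the two third-party overloads, and each overload only reports which one was chosen.

```cpp
#include <cassert>

// Hypothetical stand-ins for the third-party overloads (names are assumptions).
enum class Picked { Bool, CharPtr };

inline Picked pick(bool) { return Picked::Bool; }
inline Picked pick(const char *) { return Picked::CharPtr; }

// Asserts that overload resolution behaves as the analysis describes.
inline void demonstrateOverloadSelection() {
  const char *Narrow = "Hello, World";
  const wchar_t *Wide = L"Hello, World";

  // const char* matches the const char* overload exactly.
  assert(pick(Narrow) == Picked::CharPtr);

  // const wchar_t* cannot convert to const char*, but it can convert to
  // bool, so the bool overload is the only viable candidate.
  assert(pick(Wide) == Picked::Bool);
}
```

Calling demonstrateOverloadSelection() passes both assertions, and the second call compiles silently, which is exactly what makes the bug hard to spot.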

The following shows the program output when MyString is std::string and std::wstring, respectively:

# use std::string
clang++ ./demo.cpp && ./a.out
Bool constructor called with value: 1
Const char* constructor called with value: Hello, World
Const char* constructor called with value: Hello, World # meet our expectation
Bool function called with value: 1
Const char* function called with value: Hello, World
Const char* function called with value: Hello, World # meet our expectation

# use std::wstring
clang++ ./demo.cpp -DUSE_WIDE_STRINGS && ./a.out
Bool constructor called with value: 1
Const char* constructor called with value: Hello, World
Bool constructor called with value: 1 # unexpected, with no warning
Bool function called with value: 1
Const char* function called with value: Hello, World
Bool function called with value: 1 # unexpected, with no warning

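Solutions

We could avoid this kind of problem through more training, code review, and high-coverage unit tests, but these approaches carry a heavy cost and cognitive burden.
We would rather solve the problem simply and cheaply; at the very least, we want the error moved from run time to compile time.

Some possible solutions:

  1. Modify the third-party library
    For this case, we could modify the third-party library to add a const wchar_t* constructor to ThirdPartyClass and a const wchar_t* overload of thirdPartyFunction.
    Then, when MyString is std::wstring, the compiler selects the const wchar_t* overload as the best match, avoiding the runtime error.
  2. Write a custom clang-tidy checker
    We could write a custom clang-tidy checker that detects every place where a const wchar_t* is converted to bool in a function call or object construction, and reports a build warning when it finds one (or, going a step further, treats it as a build error).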
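
The effect of the first solution (adding a const wchar_t* overload to the library) can be sketched in isolation. This is an illustration under assumptions, not the third-party code: Chosen and patchedPick are hypothetical names standing in for the patched overload set.

```cpp
#include <cassert>

// Hypothetical stand-ins for the patched third-party overloads.
enum class Chosen { Bool, CharPtr, WCharPtr };

inline Chosen patchedPick(bool) { return Chosen::Bool; }
inline Chosen patchedPick(const char *) { return Chosen::CharPtr; }
// The overload that the first solution would add to the library.
inline Chosen patchedPick(const wchar_t *) { return Chosen::WCharPtr; }

// Asserts that the added overload now wins for wide strings.
inline void demonstratePatchedOverloads() {
  const wchar_t *Wide = L"Hello, World";
  assert(patchedPick(Wide) == Chosen::WCharPtr);
  assert(patchedPick(true) == Chosen::Bool);
}
```

An exact match on const wchar_t* ranks above the pointer-to-bool conversion, so the new overload is selected and the bool overload is no longer reached by accident.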

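The advantages of option 1 are obvious, but so are its drawbacks:

  • In most cases, the third-party library cannot be modified directly
  • Whenever a new third-party library is introduced, the process has to be repeated to keep similar problems from recurring

This article therefore focuses on the more general option 2: a custom clang-tidy checker.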


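Custom clang-tidy checker

Prerequisites

clang-tidy-extra
AST Matchers

Approach

There are two ways to add a new checker to clang-tidy:

  1. Write the checkers, compile them into a shared library, and have clang-tidy load it as a plugin via the -load option
  2. Write the checkers and compile them into the clang-tidy binary itself

Option 1 is what is known as out-of-tree check plugins.
Compared with option 2, which requires rebuilding clang-tidy, option 1 is clearly more flexible.
Moreover, since option 1 is rarely described in the Chinese-language internet community, this article covers only option 1 and gives a demo-level implementation of it.
For option 2, see "Adding a custom check to clang-tidy".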
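
A plugin can add checks without rebuilding clang-tidy because it registers its module from a static object (this article's implementation does so through ClangTidyModuleRegistry::Add), and static objects in a shared library are constructed when the library is loaded. The sketch below illustrates only that registration pattern in plain C++. MiniCheck, MiniRegistry, DemoCheck, and instantiateByName are hypothetical names, not clang-tidy APIs.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <memory>
#include <string>

// Hypothetical miniature of a check interface; not a clang-tidy class.
struct MiniCheck {
  virtual ~MiniCheck() = default;
  virtual std::string name() const = 0;
};

// Hypothetical miniature registry mapping check names to factories.
class MiniRegistry {
public:
  using Factory = std::function<std::unique_ptr<MiniCheck>()>;

  // Function-local static, so it is safe to use from static initializers.
  static std::map<std::string, Factory> &entries() {
    static std::map<std::string, Factory> Entries;
    return Entries;
  }

  // Constructing an Add<T> object registers a factory for T.
  template <typename T> struct Add {
    explicit Add(const std::string &Name) {
      entries()[Name] = [] { return std::unique_ptr<MiniCheck>(new T()); };
    }
  };
};

struct DemoCheck : MiniCheck {
  std::string name() const override { return "demo-check"; }
};

// Constructed before main(); in a plugin, this would happen at load time.
static MiniRegistry::Add<DemoCheck> RegisterDemoCheck("demo-check");

// Looks up a registered factory by name and instantiates the check.
inline std::string instantiateByName(const std::string &Name) {
  auto It = MiniRegistry::entries().find(Name);
  assert(It != MiniRegistry::entries().end() && "check was not registered");
  return It->second()->name();
}
```

The host never names DemoCheck directly. It only consults the registry, which is why a check compiled separately can still be selected by name.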

Implementation

# Directory layout
.
├── CMakeLists.txt
├── WCharToBoolConversionCheck.cpp
├── WCharToBoolConversionCheck.h
└── build_and_test.sh # for test
# CMakeLists.txt
cmake_minimum_required(VERSION 3.10)
project(WCharToBoolConversionCheckPlugin)

set(CMAKE_CXX_STANDARD 17)

find_package(LLVM REQUIRED CONFIG)
find_package(Clang REQUIRED CONFIG)

include_directories(${LLVM_INCLUDE_DIRS})
include_directories(${CLANG_INCLUDE_DIRS})

add_definitions(${LLVM_DEFINITIONS})

add_library(WCharToBoolConversionCheck SHARED WCharToBoolConversionCheck.cpp)
target_link_libraries(WCharToBoolConversionCheck PRIVATE clangASTMatchers clangTidy)
// WCharToBoolConversionCheck.h
#ifndef WCHAR_TO_BOOL_CONVERSION_CHECK_H
#define WCHAR_TO_BOOL_CONVERSION_CHECK_H

#include "clang-tidy/ClangTidyCheck.h"

namespace clang {
namespace tidy {

class WCharToBoolConversionCheck : public ClangTidyCheck {
public:
  WCharToBoolConversionCheck(StringRef Name, ClangTidyContext *Context)
      : ClangTidyCheck(Name, Context) {}

  void registerMatchers(ast_matchers::MatchFinder *Finder) override;
  void check(const ast_matchers::MatchFinder::MatchResult &Result) override;
};

} // namespace tidy
} // namespace clang

#endif
// WCharToBoolConversionCheck.cpp
#include "clang-tidy/ClangTidy.h"
#include "clang-tidy/ClangTidyCheck.h"
#include "clang-tidy/ClangTidyModule.h"
#include "clang-tidy/ClangTidyModuleRegistry.h"

namespace clang {
namespace tidy {

class WCharToBoolConversionCheck : public ClangTidyCheck {
public:
  WCharToBoolConversionCheck(StringRef Name, ClangTidyContext *Context)
      : ClangTidyCheck(Name, Context) {}

  void registerMatchers(ast_matchers::MatchFinder *Finder) override {
    using namespace clang::ast_matchers;

    // Match function calls that pass an argument to a bool parameter.
    Finder->addMatcher(
        callExpr(forEachArgumentWithParam(expr().bind("arg"),
                                          parmVarDecl(hasType(booleanType()))))
            .bind("call"),
        this);

    // Match constructor calls that pass an argument to a bool parameter.
    Finder->addMatcher(
        cxxConstructExpr(
            forEachArgumentWithParam(expr().bind("arg"),
                                     parmVarDecl(hasType(booleanType()))))
            .bind("construct"),
        this);
  }

  void check(const ast_matchers::MatchFinder::MatchResult &Result) override {
    const auto *ArgExpr = Result.Nodes.getNodeAs<Expr>("arg");

    if (!ArgExpr) {
      return;
    }

    QualType ArgType = ArgExpr->getType().getCanonicalType();

    if (const auto *PT = dyn_cast<PointerType>(ArgType)) {
      QualType PointeeType = PT->getPointeeType().getCanonicalType();
      // If the pointee type is wchar_t or const wchar_t, emit a warning.
      if (PointeeType->isWideCharType() ||
          (PointeeType.isConstQualified() &&
           PointeeType.getUnqualifiedType()->isWideCharType())) {
        diag(ArgExpr->getBeginLoc(),
             "passing 'wchar_t*' to a boolean parameter, which may lead to "
             "unexpected behavior");
      }
    }
  }
};

class WCharToBoolConversionModule : public ClangTidyModule {
public:
  void addCheckFactories(ClangTidyCheckFactories &CheckFactories) override {
    CheckFactories.registerCheck<WCharToBoolConversionCheck>(
        "wchar-to-bool-conversion-check");
  }
};

// Register the module when the plugin library is loaded.
extern "C" ClangTidyModuleRegistry::Add<WCharToBoolConversionModule>
    X("wchar-to-bool-conversion-module",
      "Adds checks for wchar_t* to bool conversions.");

} // namespace tidy
} // namespace clang
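
The pointee test performed in check() can be mirrored with standard type traits, which makes it easier to reason about which argument types get flagged. This is only a sketch in plain C++, not clang code: IsWCharPointer, wouldFlag, and demonstrateTypeTest are hypothetical names, and stripping cv-qualifiers here is an approximation of the canonical-type handling in the checker.

```cpp
#include <cassert>
#include <type_traits>

// True when T is a pointer whose pointee, ignoring const/volatile, is wchar_t.
template <typename T>
struct IsWCharPointer
    : std::integral_constant<
          bool,
          std::is_pointer<T>::value &&
              std::is_same<typename std::remove_cv<
                               typename std::remove_pointer<T>::type>::type,
                           wchar_t>::value> {};

// Convenience wrapper that deduces T from a value.
template <typename T> constexpr bool wouldFlag(const T &) {
  return IsWCharPointer<T>::value;
}

static_assert(IsWCharPointer<wchar_t *>::value, "wchar_t* is flagged");
static_assert(IsWCharPointer<const wchar_t *>::value,
              "const wchar_t* is flagged");
static_assert(!IsWCharPointer<const char *>::value,
              "const char* is not flagged");

// Runtime demonstration with the same kinds of values as demo.cpp.
inline void demonstrateTypeTest() {
  const wchar_t *Wide = L"Hello, World";
  const char *Narrow = "Hello, World";
  assert(wouldFlag(Wide));
  assert(!wouldFlag(Narrow));
}
```

Only the pointee type matters to this test, so a narrow-string pointer passed to a bool parameter is deliberately left alone, matching the scope stated for the checker.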

Test

Test script

#!/bin/bash
# build_and_test.sh
set -eu

SCRIPT_DIR=$(cd "$(dirname "${BASH_SOURCE[0]}")" &>/dev/null && pwd)
BUILD_DIR="$SCRIPT_DIR/build"

rm -rf "${BUILD_DIR}"
mkdir "${BUILD_DIR}"
cd "${BUILD_DIR}"

# build checker plugin

cmake "$SCRIPT_DIR"
make

cd "$SCRIPT_DIR"

# .dylib is the macOS shared-library suffix; on Linux the file ends in .so
LIB_FILE="$BUILD_DIR/libWCharToBoolConversionCheck.dylib"
DEMO_FILE="$SCRIPT_DIR/demo.cpp"

# start test

echo "clang-tidy demo.cpp use std::string"
clang-tidy --checks='-*,wchar-to-bool-conversion-check' -load "$LIB_FILE" "$DEMO_FILE" --

echo "clang-tidy demo.cpp use std::wstring"
clang-tidy --checks='-*,wchar-to-bool-conversion-check' -load "$LIB_FILE" "$DEMO_FILE" -- -DUSE_WIDE_STRINGS

Test result

# ./build_and_test.sh
clang-tidy demo.cpp use std::string # no warning
clang-tidy demo.cpp use std::wstring # report 2 warnings
2 warnings generated.
/Users/XanderLiu/myclang/demo.cpp:55:29: warning: passing 'wchar_t*' to a boolean parameter, which may lead to unexpected behavior [wchar-to-bool-conversion-check]
   55 |   ThirdPartyClass instance3(str.c_str());
      |                             ^
/Users/XanderLiu/myclang/demo.cpp:59:22: warning: passing 'wchar_t*' to a boolean parameter, which may lead to unexpected behavior [wchar-to-bool-conversion-check]
   59 |   thirdPartyFunction(str.c_str());
      |                      ^

Appendix

Environment

Target: arm64-apple-darwin22.6.0
Homebrew clang version: 17.0.5
cmake version: 3.27.0

Getting and running the source

  1. Install llvm and clang
  2. git clone git@github.com:zsmj2017/ClangTidyCustomizedCheckersExample.git
  3. cd ClangTidyCustomizedCheckersExample
  4. Run ./build_and_test.sh, or write your own test script