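A while ago I tried writing File-Engine, a search tool similar to Everything, and open-sourced it on GitHub. Its file synchronization used to rely on the ReadDirectoryChangesW function to monitor files. When watching an entire disk, however, that approach seemed to miss some files.
This post describes another approach: monitoring files by reading the USN journal.
The code is open source on GitHub, and the earlier version based on the ReadDirectoryChanges API is kept there as well.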
File-Engine/C++/fileMonitor at master · XUANXUQAQ/File-Engine (github.com)
Aiverything
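I am now building another project, Aiverything. If you are interested, you are welcome to try it. Compared with File-Engine, it offers faster search, an easier-to-use UI, and a plugin system.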
Aiverything | Launcher to your everything
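It can also be downloaded from github.com/panwangwin/... If you run into any bugs or have suggestions, feedback is welcome.
You can also join our QQ group: 893463594. Come in to discuss, learn, or just chat.
The code and material draw on: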
windows - USN NFTS change notification event interrupt - Stack Overflow
c++ - How can I detect only deleted, changed, and created files on a volume? - Stack Overflow
How it works
Obtaining Directory Change Notifications - Win32 apps | Microsoft Learn
This Microsoft article explains in detail how to obtain change notifications for a directory.
Change Journals - Win32 apps | Microsoft Learn
Keeping an Eye on Your NTFS Drives: the Windows 2000 Change Journal Explained | Microsoft Learn
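These two articles describe what the NTFS USN journal is, along with its data structures.
In short, every change to a file is written to the USN journal as a record. By watching for newly written USN records, we can tell that a file has changed, and that is the basis of the monitor.
Implementation
Defining the watcher class
First, define a class named NTFSChangesWatcher.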
cpp
#pragma once
#include <memory>
#include <string>
#include <Windows.h>

class NTFSChangesWatcher
{
public:
    NTFSChangesWatcher(char drive_letter);
    ~NTFSChangesWatcher() = default;

    // Method which runs an infinite loop and waits for new update sequence number in a journal.
    // The thread is blocked till the new USN record created in the journal.
    void WatchChanges(const bool* flag, void(*)(const std::u16string&), void(*)(const std::u16string&));

private:
    HANDLE OpenVolume(char drive_letter);
    bool CreateJournal(HANDLE volume);
    bool LoadJournal(HANDLE volume, USN_JOURNAL_DATA* journal_data);
    bool WaitForNextUsn(PREAD_USN_JOURNAL_DATA read_journal_data) const;
    std::unique_ptr<READ_USN_JOURNAL_DATA> GetWaitForNextUsnQuery(USN start_usn);
    bool ReadJournalRecords(PREAD_USN_JOURNAL_DATA journal_query, LPVOID buffer,
                            DWORD& byte_count) const;
    USN ReadChangesAndNotify(USN low_usn, char* buffer, void(*)(const std::u16string&), void(*)(const std::u16string&));
    std::unique_ptr<READ_USN_JOURNAL_DATA> GetReadJournalQuery(USN low_usn);
    void showRecord(std::u16string& full_path, USN_RECORD* record);

    char drive_letter_;
    HANDLE volume_;
    std::unique_ptr<USN_JOURNAL_DATA> journal_;
    DWORDLONG journal_id_;
    USN last_usn_;
    USN max_usn_;

    // Flags, which indicate which types of changes you want to listen.
    static const int FILE_CHANGE_BITMASK;
    static const int kBufferSize;
};
The public entry point is WatchChanges:
cpp
void WatchChanges(const bool* flag, void(*)(const std::u16string&), void(*)(const std::u16string&));
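The function takes three parameters. The first is the stop flag: once it is set to false, the monitoring loop exits. The second is a pointer to the callback invoked when a file is added, and the third is a pointer to the callback invoked when a file is removed.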
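To make the shape of this interface concrete, here is a minimal sketch of how a stop flag and two plain function-pointer callbacks fit together. It does not touch the journal: `DispatchEvents` and `FakeEvent` are hypothetical stand-ins fed from a vector so the example runs anywhere, and it uses `std::atomic<bool>` rather than the plain `bool` of the original signature, on the assumption that the flag is cleared from another thread.

```cpp
#include <atomic>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for one journal event; not part of the original code.
struct FakeEvent {
    bool added;            // true = file added, false = file removed
    std::u16string path;
};

using PathCallback = void (*)(const std::u16string&);

// Same parameter order as WatchChanges (flag, added callback, removed callback),
// but driven by a vector of events. Returns how many events were dispatched.
std::size_t DispatchEvents(const std::atomic<bool>* keep_running,
                           PathCallback on_added,
                           PathCallback on_removed,
                           const std::vector<FakeEvent>& events) {
    std::size_t dispatched = 0;
    for (const FakeEvent& e : events) {
        if (!keep_running->load()) break;   // flag cleared: leave the loop
        (e.added ? on_added : on_removed)(e.path);
        ++dispatched;
    }
    return dispatched;
}

// Example callbacks. They are free functions because the interface takes
// function pointers, so capturing lambdas cannot be passed directly.
std::vector<std::u16string> g_added;
std::vector<std::u16string> g_removed;
void OnAdded(const std::u16string& p) { g_added.push_back(p); }
void OnRemoved(const std::u16string& p) { g_removed.push_back(p); }
```

One thing to keep in mind with the real watcher: the flag is only checked between iterations, so clearing it takes effect after the next journal record arrives.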
Initializing the USN journal
cpp
#include <cstdio>  // fprintf

const int NTFSChangesWatcher::kBufferSize = 1024 * 1024 / 2;
const int NTFSChangesWatcher::FILE_CHANGE_BITMASK = USN_REASON_RENAME_NEW_NAME | USN_REASON_RENAME_OLD_NAME;

NTFSChangesWatcher::NTFSChangesWatcher(char drive_letter) :
    drive_letter_(drive_letter),
    volume_(nullptr),
    journal_id_(0),
    last_usn_(0),
    max_usn_(0)
{
    volume_ = OpenVolume(drive_letter_);
    journal_ = std::make_unique<USN_JOURNAL_DATA>();
    if (const bool res = LoadJournal(volume_, journal_.get()); !res) {
        // The members keep their zero defaults, so the object stays in a defined state.
        fprintf(stderr, "Failed to load journal\n");
        return;
    }
    max_usn_ = journal_->MaxUsn;
    journal_id_ = journal_->UsnJournalID;
    // Start watching from the next USN to be written, i.e. skip existing records.
    last_usn_ = journal_->NextUsn;
}
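The constructor first opens the volume with OpenVolume, which returns a HANDLE. It then allocates the memory that holds the journal data and reads the USN journal information with LoadJournal.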
cpp
HANDLE NTFSChangesWatcher::OpenVolume(const char drive_letter)
{
    // Template path "\\?\a:"; index 4 is the drive letter placeholder.
    wchar_t pattern[10] = L"\\\\?\\a:";
    pattern[4] = static_cast<wchar_t>(drive_letter);
    // CreateFileW is called explicitly because the path is a wide string,
    // so the code does not depend on the UNICODE macro being defined.
    const HANDLE volume = CreateFileW(
        pattern,  // lpFileName
        // also could be | FILE_READ_DATA | FILE_READ_ATTRIBUTES | SYNCHRONIZE
        GENERIC_READ | GENERIC_WRITE | SYNCHRONIZE,  // dwDesiredAccess
        FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE,  // share mode
        nullptr,  // default security attributes
        OPEN_EXISTING,  // disposition
        // It is always set, no matter whether you explicitly specify it or not. This means, that access
        // must be aligned with sector size so we can only read a number of bytes that is a multiple of the sector size.
        FILE_FLAG_NO_BUFFERING,  // file attributes
        nullptr  // do not copy file attributes
    );
    if (volume == INVALID_HANDLE_VALUE) {
        // An error occurred!
        fprintf(stderr, "Failed to open volume\n");
        return nullptr;
    }
    return volume;
}
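The path handed to CreateFile above is easy to get subtly wrong, so it can help to isolate it. The sketch below builds the same `\\?\X:` string as OpenVolume; `MakeVolumePath` is a hypothetical helper name, and the letter check is an addition of mine rather than something the original code does.

```cpp
#include <cctype>
#include <string>

// Builds the volume path used by OpenVolume, e.g. L"\\\\?\\C:" for 'C'.
// Returns an empty string when the argument is not an ASCII letter.
// Note there is no trailing backslash: with one, the path would name the
// root directory of the drive instead of the volume itself.
std::wstring MakeVolumePath(char drive_letter) {
    if (!std::isalpha(static_cast<unsigned char>(drive_letter))) {
        return std::wstring();
    }
    std::wstring path = L"\\\\?\\";
    path += static_cast<wchar_t>(drive_letter);
    path += L':';
    return path;
}
```

Opening a volume handle like this generally requires administrator rights, so a failure in OpenVolume is worth checking against the process privileges first.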
With the HANDLE in hand, LoadJournal queries the USN journal. If the first query fails, it tries to create the journal and then queries again.
cpp
bool NTFSChangesWatcher::LoadJournal(HANDLE volume, USN_JOURNAL_DATA* journal_data)
{
    DWORD byte_count;
    // Try to open journal.
    if (!DeviceIoControl(volume,
                         FSCTL_QUERY_USN_JOURNAL,
                         nullptr,
                         0,
                         journal_data,
                         sizeof(*journal_data),
                         &byte_count,
                         nullptr))
    {
        // If failed (for example, in case journaling is disabled), create journal and retry.
        if (CreateJournal(volume)) {
            return LoadJournal(volume, journal_data);
        }
        return false;
    }
    return true;
}

bool NTFSChangesWatcher::CreateJournal(HANDLE volume)
{
    DWORD byte_count;
    CREATE_USN_JOURNAL_DATA create_journal_data{};
    const bool ok = DeviceIoControl(volume,  // handle to volume
                                    FSCTL_CREATE_USN_JOURNAL,  // dwIoControlCode
                                    &create_journal_data,  // input buffer
                                    sizeof(create_journal_data),  // size of input buffer
                                    nullptr,  // lpOutBuffer
                                    0,  // nOutBufferSize
                                    &byte_count,  // number of bytes returned
                                    nullptr) != 0;  // OVERLAPPED structure
    if (!ok) {
        // An error occurred!
    }
    return ok;
}
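LoadJournal retries by calling itself, which would recurse without limit if creating the journal kept succeeding while the query kept failing. A bounded variant of the same "query, create on failure, query once more" flow might look like the sketch below. `QueryOrCreateOnce` is a hypothetical helper, and the two callables stand in for the DeviceIoControl calls so the logic can be run without a volume handle.

```cpp
#include <functional>

// Runs `query`. If it fails, runs `create` and then `query` exactly once more.
// Returns true only if one of the two query attempts succeeded.
bool QueryOrCreateOnce(const std::function<bool()>& query,
                       const std::function<bool()>& create) {
    if (query()) {
        return true;            // journal already exists and was read
    }
    if (!create()) {
        return false;           // could not create the journal either
    }
    return query();             // single retry after creation
}
```

In the watcher, `query` would wrap the FSCTL_QUERY_USN_JOURNAL call and `create` would wrap CreateJournal.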
Starting the monitor
Once initialization is complete, call WatchChanges.
cpp
void NTFSChangesWatcher::WatchChanges(const bool* flag,
                                      void(*file_added_callback_func)(const std::u16string&),
                                      void(*file_removed_callback_func)(const std::u16string&))
{
    const auto u_buffer = std::make_unique<char[]>(kBufferSize);
    const auto read_journal_query = GetWaitForNextUsnQuery(last_usn_);
    while (*flag)
    {
        // This function does not return until new USN record created.
        // Its return value is ignored here, so a failing call makes the loop spin.
        WaitForNextUsn(read_journal_query.get());
        last_usn_ = ReadChangesAndNotify(read_journal_query->StartUsn,
                                         u_buffer.get(),
                                         file_added_callback_func,
                                         file_removed_callback_func);
        read_journal_query->StartUsn = last_usn_;
    }
    // The watcher takes ownership of the flag: the caller must have allocated it
    // with new and must not touch it after the loop has exited.
    delete flag;
}
There are two core methods: WaitForNextUsn and ReadChangesAndNotify.
Let's look at WaitForNextUsn first.
cpp
bool NTFSChangesWatcher::WaitForNextUsn(PREAD_USN_JOURNAL_DATA read_journal_data) const
{
    DWORD bytes_read;
    // This function does not return until new USN record created.
    const bool ok = DeviceIoControl(volume_,
                                    FSCTL_READ_USN_JOURNAL,
                                    read_journal_data,
                                    sizeof(*read_journal_data),
                                    &read_journal_data->StartUsn,
                                    sizeof(read_journal_data->StartUsn),
                                    &bytes_read,
                                    nullptr) != 0;
    return ok;
}
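This sends the FSCTL_READ_USN_JOURNAL control code through DeviceIoControl. Because the query was initialized to start at the journal's next USN, past the last existing record, the call blocks until some file operation makes NTFS write a new USN record.
The last argument, lpOverlapped, is NULL here: monitoring needs a call that blocks until something changes, and an asynchronous call would only make the loop more awkward.
DeviceIoControl is already explained in many places, so here are just the Microsoft Learn pages.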
DeviceIoControl function (ioapiset.h) - Win32 apps | Microsoft Learn
And for FSCTL_READ_USN_JOURNAL:
FSCTL_READ_USN_JOURNAL - Win32 apps | Microsoft Learn
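The post does not show GetWaitForNextUsnQuery, so the following is only a sketch of how such a blocking query could be filled in. It relies on the documented behavior that a zero Timeout combined with a nonzero BytesToWaitFor makes FSCTL_READ_USN_JOURNAL wait for new data. To stay compilable without `<Windows.h>`, it declares a stand-in struct, `ReadUsnJournalQuery`, with the same fields as READ_USN_JOURNAL_DATA_V0; the function name `MakeWaitQuery` and the chosen field values are my assumptions, not the author's code.

```cpp
#include <cstdint>

// Stand-in with the same fields, in the same order, as READ_USN_JOURNAL_DATA_V0.
// On Windows, use the real structure from <Windows.h> instead.
struct ReadUsnJournalQuery {
    std::int64_t  StartUsn;
    std::uint32_t ReasonMask;
    std::uint32_t ReturnOnlyOnClose;
    std::uint64_t Timeout;
    std::uint64_t BytesToWaitFor;
    std::uint64_t UsnJournalID;
};

// Builds a query that blocks until at least one new record is written.
ReadUsnJournalQuery MakeWaitQuery(std::int64_t start_usn, std::uint64_t journal_id) {
    ReadUsnJournalQuery q{};
    q.StartUsn = start_usn;        // typically the journal's NextUsn: nothing to read yet
    q.ReasonMask = 0xFFFFFFFFu;    // do not filter by reason
    q.ReturnOnlyOnClose = 0;       // report records before the file is closed too
    q.Timeout = 0;                 // with BytesToWaitFor != 0: wait with no time limit
    q.BytesToWaitFor = 1;          // wake up as soon as any new data exists
    q.UsnJournalID = journal_id;   // must match the ID returned by the journal query
    return q;
}
```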
When this method returns, a new USN record has appeared on the volume, and execution moves on to the next function.
ReadChangesAndNotify
cpp
USN NTFSChangesWatcher::ReadChangesAndNotify(USN low_usn,
                                             char* buffer,
                                             void(*file_added_callback_func)(const std::u16string&),
                                             void(*file_removed_callback_func)(const std::u16string&))
{
    DWORD byte_count;
    const auto journal_query = GetReadJournalQuery(low_usn);
    memset(buffer, 0, kBufferSize);
    if (!ReadJournalRecords(journal_query.get(), buffer, byte_count))
    {
        // An error occurred.
        return low_usn;
    }
    auto record = reinterpret_cast<USN_RECORD*>(reinterpret_cast<USN*>(buffer) + 1);
    const auto record_end = reinterpret_cast<USN_RECORD*>(reinterpret_cast<BYTE*>(buffer) + byte_count);
    std::u16string full_path;
    for (; record < record_end;
         record = reinterpret_cast<USN_RECORD*>(reinterpret_cast<BYTE*>(record) + record->RecordLength))
    {
        const auto reason = record->Reason;
        full_path.clear();
        // It is really strange, but some system files creating and deleting at the same time.
        if ((reason & USN_REASON_FILE_CREATE) && (reason & USN_REASON_FILE_DELETE))
        {
            continue;
        }
        if ((reason & USN_REASON_FILE_CREATE) && (reason & USN_REASON_CLOSE))
        {
            showRecord(full_path, record);
            file_added_callback_func(full_path);
        }
        else if ((reason & USN_REASON_FILE_DELETE) && (reason & USN_REASON_CLOSE))
        {
            showRecord(full_path, record);
            file_removed_callback_func(full_path);
        }
        else if (reason & FILE_CHANGE_BITMASK)
        {
            if (reason & USN_REASON_RENAME_OLD_NAME)
            {
                showRecord(full_path, record);
                file_removed_callback_func(full_path);
            }
            else if (reason & USN_REASON_RENAME_NEW_NAME)
            {
                showRecord(full_path, record);
                file_added_callback_func(full_path);
            }
        }
    }
    return *reinterpret_cast<USN*>(buffer);
}
Here ReadJournalRecords calls DeviceIoControl with FSCTL_READ_USN_JOURNAL to read out the new USN records.
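The buffer that comes back has a simple layout, which the loop above depends on: the first 8 bytes hold the next USN to read from, and variable-length records follow, each starting with its own RecordLength. The sketch below walks that layout on a plain byte buffer. `NextUsn` and `RecordOffsets` are hypothetical helper names, and the extra bounds checks are an addition of mine.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Reads the leading 8-byte "next USN" value from the output buffer.
std::int64_t NextUsn(const unsigned char* buffer) {
    std::int64_t usn = 0;
    std::memcpy(&usn, buffer, sizeof(usn));
    return usn;
}

// Returns the byte offset of every record in the buffer.
// byte_count is the number of bytes the read actually returned.
std::vector<std::size_t> RecordOffsets(const unsigned char* buffer, std::size_t byte_count) {
    std::vector<std::size_t> offsets;
    std::size_t pos = sizeof(std::int64_t);            // skip the leading next-USN
    while (pos + sizeof(std::uint32_t) <= byte_count) {
        std::uint32_t record_length = 0;               // first field of each record
        std::memcpy(&record_length, buffer + pos, sizeof(record_length));
        if (record_length == 0 || record_length > byte_count - pos) {
            break;                                     // malformed data: stop instead of looping
        }
        offsets.push_back(pos);
        pos += record_length;                          // records are packed back to back
    }
    return offsets;
}
```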
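After the read completes, the Reason field of each USN_RECORD tells whether the file was created or deleted. There are many other USN_REASON values, but since only file additions and removals matter here, the code handles just these:
- USN_REASON_FILE_CREATE
- USN_REASON_FILE_DELETE
- USN_REASON_RENAME_OLD_NAME
- USN_REASON_RENAME_NEW_NAME
The full list of reasons is documented here: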
USN_RECORD_V2 - Win32 apps | Microsoft Learn
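The decision logic in ReadChangesAndNotify can be pulled out into a pure function, which makes it easy to see how each Reason combination is treated. The sketch below mirrors the branches of the loop above; `Classify` and `Change` are hypothetical names, and the constants repeat the winnt.h values so that the example compiles without `<Windows.h>`.

```cpp
#include <cstdint>

// Same values as the USN_REASON_* macros in winnt.h.
constexpr std::uint32_t kReasonFileCreate    = 0x00000100;  // USN_REASON_FILE_CREATE
constexpr std::uint32_t kReasonFileDelete    = 0x00000200;  // USN_REASON_FILE_DELETE
constexpr std::uint32_t kReasonRenameOldName = 0x00001000;  // USN_REASON_RENAME_OLD_NAME
constexpr std::uint32_t kReasonRenameNewName = 0x00002000;  // USN_REASON_RENAME_NEW_NAME
constexpr std::uint32_t kReasonClose         = 0x80000000;  // USN_REASON_CLOSE

enum class Change { kNone, kAdded, kRemoved };

// Maps a Reason bitmask to the callback that the watcher would invoke.
Change Classify(std::uint32_t reason) {
    const bool created = (reason & kReasonFileCreate) != 0;
    const bool deleted = (reason & kReasonFileDelete) != 0;
    const bool closed  = (reason & kReasonClose) != 0;

    if (created && deleted) return Change::kNone;     // created and deleted together: skip
    if (created && closed)  return Change::kAdded;
    if (deleted && closed)  return Change::kRemoved;
    if (reason & kReasonRenameOldName) return Change::kRemoved;  // old name disappears
    if (reason & kReasonRenameNewName) return Change::kAdded;    // new name appears
    return Change::kNone;
}
```

A rename therefore shows up as a removal of the old name followed by an addition of the new one.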
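Getting the full file path
A USN record stores only the file name and file reference numbers, not the path. To get the full path, we have to query upward through the file reference number and the parent file reference number, joining the names along the way.
That is the job of the showRecord function used above. It takes two parameters: full_path, and record, a USN_RECORD pointer to the file record whose full path is being assembled.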
cpp
void NTFSChangesWatcher::showRecord(std::u16string& full_path, USN_RECORD* record)
{
    static std::wstring sep_wstr(L"\\");
    static std::u16string sep(sep_wstr.begin(), sep_wstr.end());
    const indexer_common::FileInfo file_info(*record, drive_letter_);
    if (full_path.empty())
    {
        full_path += file_info.GetName();
    }
    else
    {
        full_path = file_info.GetName() + sep + full_path;
    }
    DWORD byte_count = 1;
    auto buffer = std::make_unique<char[]>(kBufferSize);
    MFT_ENUM_DATA_V0 med;
    med.StartFileReferenceNumber = record->ParentFileReferenceNumber;
    med.LowUsn = 0;
    med.HighUsn = max_usn_;
    if (!DeviceIoControl(volume_,
                         FSCTL_ENUM_USN_DATA,
                         &med,
                         sizeof(med),
                         buffer.get(),
                         kBufferSize,
                         &byte_count,
                         nullptr))
    {
        return;
    }
    auto* parent_record = reinterpret_cast<USN_RECORD*>(reinterpret_cast<USN*>(buffer.get()) + 1);
    if (parent_record->FileReferenceNumber != record->ParentFileReferenceNumber)
    {
        static std::wstring colon_wstr(L":");
        static std::u16string colon(colon_wstr.begin(), colon_wstr.end());
        std::string drive;
        drive += drive_letter_;
        auto&& w_drive = string2wstring(drive);
        const std::u16string drive_u16(w_drive.begin(), w_drive.end());
        full_path = drive_u16 + colon + sep + full_path;
        return;
    }
    showRecord(full_path, parent_record);
}
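showRecord relies on indexer_common::FileInfo to pull the name out of a record, and that helper is not shown in this post. As a rough sketch of what such a step involves, the function below reads FileNameLength and FileNameOffset from a raw USN_RECORD_V2 and copies the name out. `ExtractFileName` is a hypothetical name, the offsets follow the documented USN_RECORD_V2 field order, and the code assumes a little-endian host, as on Windows.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>

// Byte offsets of two fields inside USN_RECORD_V2.
constexpr std::size_t kFileNameLengthPos = 56;  // WORD FileNameLength, in bytes
constexpr std::size_t kFileNameOffsetPos = 58;  // WORD FileNameOffset, from record start
constexpr std::size_t kFixedHeaderSize   = 60;  // FileName starts here at the earliest

// Copies the UTF-16 file name out of one raw record.
// The name is not null-terminated in the record, so its length must come
// from FileNameLength. Returns an empty string if the record looks truncated.
std::u16string ExtractFileName(const unsigned char* record, std::size_t record_length) {
    if (record_length < kFixedHeaderSize) {
        return std::u16string();
    }
    std::uint16_t name_bytes = 0;
    std::uint16_t name_offset = 0;
    std::memcpy(&name_bytes, record + kFileNameLengthPos, sizeof(name_bytes));
    std::memcpy(&name_offset, record + kFileNameOffsetPos, sizeof(name_offset));
    if (static_cast<std::size_t>(name_offset) + name_bytes > record_length) {
        return std::u16string();
    }
    std::u16string name(name_bytes / sizeof(char16_t), u'\0');
    if (!name.empty()) {
        std::memcpy(&name[0], record + name_offset, name.size() * sizeof(char16_t));
    }
    return name;
}
```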
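The function first gets the file name and the parent file reference number, then defines an MFT_ENUM_DATA structure. Using MFT_ENUM_DATA_V1 failed with error 87, ERROR_INVALID_PARAMETER, so the code uses MFT_ENUM_DATA_V0 instead.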
System Error Codes (0-499) (WinError.h) - Win32 apps | Microsoft Learn
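StartFileReferenceNumber is set to record->ParentFileReferenceNumber, and the upper bound HighUsn is set to the max_usn_ saved during initialization.
Calling DeviceIoControl with the FSCTL_ENUM_USN_DATA control code then reads the USN record of record's parent directory.
The parent directory's record is then passed back in as record for a recursive lookup.
The lookup keeps moving upward, prepending each name to full_path, and leaves the recursion once the top level is reached.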
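The upward walk itself is independent of how each parent is found. The sketch below shows the same idea over an in-memory table instead of one FSCTL_ENUM_USN_DATA call per level. This is a swapped-in technique for illustration, not what showRecord does: `Node`, `BuildFullPath`, and the assumption that a table of file reference numbers is already available are all hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>

// One directory entry: its name and the reference number of its parent.
struct Node {
    std::u16string name;
    std::uint64_t parent;
};

// Walks from file_ref up to the first parent that is not in the table
// (treated as the volume root) and joins the names with backslashes.
std::u16string BuildFullPath(const std::unordered_map<std::uint64_t, Node>& nodes,
                             std::uint64_t file_ref,
                             char16_t drive_letter) {
    std::u16string path;
    // The depth bound keeps a cyclic or corrupt table from looping forever.
    for (std::size_t depth = 0; depth <= nodes.size(); ++depth) {
        const auto it = nodes.find(file_ref);
        if (it == nodes.end()) {
            break;                                   // reached the top level
        }
        path = path.empty() ? it->second.name
                            : it->second.name + u"\\" + path;  // prepend parent name
        file_ref = it->second.parent;
    }
    return std::u16string(1, drive_letter) + u":\\" + path;
}
```

Unlike the recursive version, an iterative walk like this avoids allocating a fresh buffer at every level.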
With the full path in hand, the two callbacks can be invoked to handle it.