How to keep a history of file changes with C++

October 11, 2022

Here, I’ll show you how to keep track of file changes in a directory and store them in Reduct Storage by using its C++ client SDK. You can find the full working example here.

Running Reduct Storage

If you’re a Linux user, the easiest way to run the storage engine is Docker. This is an example of a docker-compose.yml file:

services:
  reduct-storage:
    image: reductstorage/engine:v1.0.1
    volumes:
      - ./data:/data
    environment:
      RS_LOG_LEVEL: DEBUG
    ports:
      - 8383:8383

You can also download the binaries and run them:

RS_DATA_PATH=./data reduct-storage

If everything is fine, you should see the Web Console at http://127.0.0.1:8383.

Installing Reduct Storage SDK for C++

Currently, you can only build and install the library manually. Follow these instructions.

File Watcher in C++

The SDK provides a CMake find script, so you can easily integrate it into your CMake project. This is an example of a CMakeLists.txt:

cmake_minimum_required(VERSION 3.23)
project(file_watcher_example)

set(CMAKE_CXX_STANDARD 20)

find_package(ReductCpp 1.0.1)
find_package(ZLIB)
find_package(OpenSSL)

add_executable(file_watcher main.cc)
target_link_libraries(file_watcher 
  ${REDUCT_CPP_LIBRARIES} ${ZLIB_LIBRARIES} 
  OpenSSL::SSL OpenSSL::Crypto)

Now we’re ready to write C++ code. Our main.cc file:

#include <reduct/client.h>

#include <chrono>
#include <filesystem>
#include <fstream>
#include <iostream>
#include <map>
#include <regex>
#include <string>
#include <thread>

constexpr std::string_view kReductStorageUrl = "http://127.0.0.1:8383";
constexpr std::string_view kWatchedPath = "./";

namespace fs = std::filesystem;

int main() {
  using ReductClient = reduct::IClient;
  using ReductBucket = reduct::IBucket;

  auto client = ReductClient::Build(kReductStorageUrl);

  auto [bucket, err] = client->GetOrCreateBucket(
      "watched_files", ReductBucket::Settings{
                           .quota_type = ReductBucket::QuotaType::kFifo,
                           .quota_size = 100'000'000,  // 100 MB
                       });
  if (err) {
    std::cerr << "Failed to create bucket: " << err << std::endl;
    return -1;
  }

  std::cout << "Create bucket" << std::endl;

  std::map<std::string, fs::file_time_type> file_timestamp_map;
  for (;;) {
    for (auto& file : fs::directory_iterator(kWatchedPath)) {
      bool is_changed = false;
      // check only regular files
      if (!fs::is_regular_file(file)) {
        continue;
      }

      const auto filename = file.path().filename().string();
      auto ts = fs::last_write_time(file);

      if (file_timestamp_map.contains(filename)) {
        auto& last_ts = file_timestamp_map[filename];
        if (ts != last_ts) {
          is_changed = true;
        }
        last_ts = ts;
      } else {
        file_timestamp_map[filename] = ts;
        is_changed = true;
      }

      if (!is_changed) {
        continue;
      }

      std::string alias = filename;
      std::regex_replace(
          alias.begin(), filename.begin(), filename.end(), std::regex("\\."),
          "_");  // we use the filename as an entry name. It can't contain dots.
      std::cout << "`" << filename << "` is modified. Storing as `" << alias
                << "` ";

      std::ifstream changed_file(file.path());
      if (!changed_file) {
        std::cerr << "Failed to open file";
        return -1;
      }

      auto file_size = fs::file_size(file);

      auto write_err = bucket->Write(
          alias, std::chrono::file_clock::to_sys(ts),
          [file_size, &changed_file](ReductBucket::WritableRecord* rec) {
            rec->Write(file_size, [&](size_t offset, size_t size) {
              std::string buffer;
              buffer.resize(size);
              changed_file.read(buffer.data(), size);
              std::cout << "." << std::flush;
              return std::pair{offset + size <= file_size, buffer};
            });
          });

      if (write_err) {
        std::cout << " Err:" << write_err << std::endl;
      } else {
        std::cout << " OK (" << file_size / 1024 << " kB)" << std::endl;
      }
    }

    std::this_thread::sleep_for(std::chrono::milliseconds(100));
  }
  return 0;
}

OK, it has quite a few lines, but don’t worry, this is a simple program. Let's look at the code in detail.

Creating a Bucket

To start writing to the database, we must create a bucket:

  auto client = ReductClient::Build(kReductStorageUrl);
  auto [bucket, err] = client->GetOrCreateBucket(
      "watched_files", ReductBucket::Settings{
                           .quota_type = ReductBucket::QuotaType::kFifo,
                           .quota_size = 100'000'000,  // 100Mb
                       });
  if (err) {
    std::cerr << "Failed to create bucket: " << err << std::endl;
    return -1;
  }

Here we build a client that talks to the storage engine at the kReductStorageUrl URL. Then we create a bucket named watched_files or get the existing one. Note that we also provide settings that limit its size to 100 MB, so the storage engine starts removing old data when we reach this quota.
The SDK doesn't throw any exceptions. Each method returns reduct::Error or reduct::Result<T>, so you can easily check the result in your code and print error messages.

Watching Files

We implement the file watcher in a straightforward way:

  std::map<std::string, fs::file_time_type> file_timestamp_map;
  for (;;) {
    for (auto& file : fs::directory_iterator(kWatchedPath)) {
      bool is_changed = false;
      // check only regular files
      if (!fs::is_regular_file(file)) {
        continue;
      }

      const auto filename = file.path().filename().string();
      auto ts = fs::last_write_time(file);

      if (file_timestamp_map.contains(filename)) {
        auto& last_ts = file_timestamp_map[filename];
        if (ts != last_ts) {
          is_changed = true;
        }
        last_ts = ts;
      } else {
        file_timestamp_map[filename] = ts;
        is_changed = true;
      }

      if (!is_changed) {
        continue;
      }

      // Storing a changed file...
    }

    std::this_thread::sleep_for(std::chrono::milliseconds(100));
  }

We iterate over the given directory with fs::directory_iterator(kWatchedPath) and keep the last modification time of each file in the file_timestamp_map map. If a file is new (it wasn’t in the map) or has changed (its timestamp is different), we set the is_changed flag to start storing the changed file.

Don't forget to sleep for a while at the end of each cycle to avoid overloading the CPU.

Storing Files

The history of a file is represented as an entry in Reduct Storage. Because an entry name can't contain “.”, we must replace the dots in our file names:

      std::string alias = filename;
      std::regex_replace(
          alias.begin(), filename.begin(), filename.end(), std::regex("\\."),
          "_");  // we use the filename as an entry name. It can't contain dots.
      std::cout << "`" << filename << "` is modified. Storing as `" << alias
                << "` ";
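
Since only a single character is replaced, the same renaming can be done without a regular expression. This is a small sketch of that alternative. The helper name MakeEntryName is hypothetical and is not part of the SDK.

```cpp
#include <algorithm>
#include <string>

// Hypothetical helper: returns a copy of `filename` with every '.'
// replaced by '_', so it can be used as an entry name.
std::string MakeEntryName(std::string filename) {
  std::replace(filename.begin(), filename.end(), '.', '_');
  return filename;
}
```

For example, MakeEntryName("main.cc") yields "main_cc". With either approach, two different files such as a.b and a_b map to the same entry name, so their histories would be mixed in one entry.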

Then we open the changed file for reading:

      std::ifstream changed_file(file.path());
      if (!changed_file) {
        std::cerr << "Failed to open file";
        return -1;
      }

And write it chunkwise to the storage engine:

      auto file_size = fs::file_size(file);
      auto write_err = bucket->Write(
          alias, std::chrono::file_clock::to_sys(ts),
          [file_size, &changed_file](ReductBucket::WritableRecord* rec) {
            rec->Write(file_size, [&](size_t offset, size_t size) {
              std::string buffer;
              buffer.resize(size);
              changed_file.read(buffer.data(), size);
              std::cout << "." << std::flush;
              return std::pair{offset + size <= file_size, buffer};
            });
          });

As you can see, it's quite verbose, but we send files in small chunks, so we can send terabytes without worrying about memory! If you put a huge file into your watched directory, you can see how fast Reduct Storage is.
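
To see why chunking keeps memory usage flat, here is a standalone sketch of chunked reading that does not use the SDK at all. The function name ForEachChunk and its callback signature are assumptions made for this illustration. The SDK's own callback has a different shape, as shown above.

```cpp
#include <cstddef>
#include <filesystem>
#include <fstream>
#include <functional>
#include <string>

// Hypothetical helper: reads `path` in pieces of at most `chunk_size` bytes
// and passes each piece to `sink`. Only one chunk is held in memory at a time.
// Returns the total number of bytes delivered (0 if the file can't be opened).
std::size_t ForEachChunk(const std::filesystem::path& path,
                         std::size_t chunk_size,
                         const std::function<void(const std::string&)>& sink) {
  std::ifstream in(path, std::ios::binary);
  std::string buffer(chunk_size, '\0');
  std::size_t total = 0;

  while (in) {
    in.read(buffer.data(), static_cast<std::streamsize>(buffer.size()));
    const auto got = static_cast<std::size_t>(in.gcount());
    if (got == 0) {
      break;  // nothing left to read
    }
    sink(buffer.substr(0, got));  // the last chunk may be shorter
    total += got;
  }
  return total;
}
```

The buffer is allocated once and reused, so the memory footprint depends on the chunk size, not on the file size.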

Getting Data

You can get the data by using the Bucket::Query method. You can also use the Python or JavaScript client SDKs, or even wget:

wget http://127.0.0.1:8383/api/v1/b/watched_files/<File-Name>

I hope it was helpful! Thanks!
