Ideally, clone the project at https://github.com/SuperKogito/fastft, then download the release and place it under https://github.com/SuperKogito/fastft/tree/main/comparison. See https://github.com/SuperKogito/fastft/blob/main/comparison/README.md for the detailed build steps. Once the project is built, you can compare the fastft results against librosa. Alternatively, build https://github.com/SuperKogito/fastft/tree/main/example/cMOSNet for a clearer use case of fastft (this is the example presented in the YouTube video).
About
Low-latency inference on CPUs demands fast, efficient building blocks. The Short-Time Fourier Transform (STFT) is a common tool in audio AI tasks, yet there is currently no standard C implementation that facilitates fast and efficient inference. Fastft aims to address this gap with an implementation built on FFTW (the Fastest Fourier Transform in the West). It is suitable for spectrogram/STFT-based inference (e.g., models such as Spleeter and MOSNet), and it can also be extended to cover feature extraction algorithms such as MFCC. While some deep learning libraries offer the option of incorporating the STFT into the model itself, these implementations often differ from one another and may restrict developer flexibility. Both are critical considerations when targeting embedded hardware.
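For readers unfamiliar with what an STFT computes, here is a minimal sketch in pure Python. It is not fastft's API and does not use FFTW. The names `hann` and `naive_stft`, the periodic Hann window, and the tiny `n_fft`/`hop` defaults are all assumptions chosen for illustration. It uses a direct DFT, which produces the same values an FFT would, only more slowly.

```python
import cmath
import math


def hann(n_fft):
    # Periodic Hann window (assumption: a common default for STFT analysis).
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * n / n_fft) for n in range(n_fft)]


def naive_stft(signal, n_fft=8, hop=4):
    """Return a list of frames, each holding n_fft // 2 + 1 complex bins."""
    window = hann(n_fft)
    n_bins = n_fft // 2 + 1  # one-sided spectrum of a real-valued signal
    frames = []
    # Slide over the signal in steps of `hop`; no padding, so trailing samples
    # that do not fill a whole frame are dropped.
    for start in range(0, len(signal) - n_fft + 1, hop):
        chunk = [signal[start + n] * window[n] for n in range(n_fft)]
        spectrum = []
        for k in range(n_bins):
            # Direct O(n_fft^2) DFT of the windowed frame.
            acc = sum(
                chunk[n] * cmath.exp(-2j * math.pi * k * n / n_fft)
                for n in range(n_fft)
            )
            spectrum.append(acc)
        frames.append(spectrum)
    return frames


if __name__ == "__main__":
    # A cosine completing 2 cycles per 8 samples, i.e. centred on bin 2.
    tone = [math.cos(2.0 * math.pi * 2 * n / 8) for n in range(16)]
    spec = naive_stft(tone)
    print("frames:", len(spec), "bins per frame:", len(spec[0]))
    print("magnitudes of first frame:", [round(abs(c), 3) for c in spec[0]])
```

Taking the magnitude of each bin gives the spectrogram that STFT-based models typically consume. Note that framing conventions (padding, centring, window type) vary between libraries, so a reference like this only matches another implementation when those settings agree.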