I'm Filip, a Machine Learning Engineer and DevRel passionate about technology, AI, philosophy, and art. I'm also a part-time fashion model.
- Superlinked - Building small-model inference for AI search, retrieval, and agents.
- Applied AI research - Contributing to a research paper on FlashNorm, a technique that streamlines the transformer architecture by removing the weights from RMSNorm and merging them into the next linear layer.
- Custom Superlinked retriever in LangChain - A PyPI package that implements a custom Superlinked mixture-of-encoders retriever in LangChain
- LlamaHub: Superlinked x LlamaIndex - A custom LlamaIndex retriever that uses Superlinked
- FlashNorm GPU Benchmark - Measures the speedup from deferred normalization (GEMM || inv_rms overlap) on an NVIDIA T4 GPU, using CUDA streams and a custom Triton kernel to achieve a 12-14% speedup at Llama-7B scale
- What actually makes embedding inference fast? (article)
- Embeddings theory, matrix maths, and research by Google DeepMind @ Haystack EU 2025 (video)
- A Practical Guide on Choosing a Vector Database (article)
- Mixture of Encoders @ Berlin Buzzwords 2025 (video)
- Beyond Multimodal Vectors: Hotel Search With Superlinked and Qdrant (video)
- Spotify Song Recommendation Tool - Uses transformer models and vector search to recommend songs based on mood
- AdalFlow - An LLM application library that helps developers build and optimize LLM task pipelines
- Multiple Data Science and ML roles in retail, fintech, and biomedical AI
- Large Language Models (LLMs), Machine Learning Systems, Vector Search & Embeddings, Tech startups


