Speaker

Vincenzo Lombardo

Operations Manager and Team Leader, Apache NiFi

Pisa, Italy

Focusing on No-Code solutions, I specialize in the NiFi ecosystem (NiFi, MiNiFi, NiFi Registry, C2 Server, NiFiKop) for ETL and systems integration.

My background includes expertise in Search technologies (Google Search Appliance, Mindbreeze, OpenSearch, Elasticsearch, and the ELK stack) and security log indexing with Wazuh and Logstash.

Area of Expertise

  • Information & Communications Technology

Topics

  • Apache NiFi
  • MiNiFi

Distributed OpenSearch Monitoring at Scale with Apache NiFi and MiNiFi Agents

As OpenSearch clusters grow beyond 10–20 nodes, centralized monitoring becomes a bottleneck: single points of failure, API overload, and higher latency. Traditional approaches don’t scale.

This session presents a production-ready distributed monitoring architecture using Apache NiFi and MiNiFi. MiNiFi agents on each node collect local metrics via a custom NodeStatsProcessor (CPU, heap, JVM, I/O, thread pools), while central NiFi collectors aggregate and deduplicate cluster-wide metrics using a custom ClusterStatsProcessor and forward them to OpenSearch.
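As an illustration of the local-collection half, the following sketch flattens a stats document into flat metric records. The session's NodeStatsProcessor is a custom component whose internals are not shown here; this example only assumes the general shape of OpenSearch's `GET /_nodes/_local/stats` response, and is not the speaker's implementation.

```python
from datetime import datetime, timezone

def flatten_node_stats(stats: dict) -> list[dict]:
    """Flatten a subset of a node-stats response into flat metric records,
    as a NodeStatsProcessor-style collection step might."""
    records = []
    ts = datetime.now(timezone.utc).isoformat()
    for node_id, node in stats.get("nodes", {}).items():
        base = {"node_id": node_id, "node_name": node.get("name"), "@timestamp": ts}
        heap = node.get("jvm", {}).get("mem", {}).get("heap_used_percent")
        cpu = node.get("os", {}).get("cpu", {}).get("percent")
        records.append({**base, "metric": "jvm.heap_used_percent", "value": heap})
        records.append({**base, "metric": "os.cpu_percent", "value": cpu})
    return records
```

Each record can then be emitted as a JSON FlowFile for the central NiFi collectors to aggregate.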

Results include linear scalability, sub-millisecond node-level metric collection, high availability, and minimal overhead. Key insights cover separating local from cluster-wide collection, deployment patterns (bare metal, VMs, containers), HA strategies with multiple NiFi collectors, and lessons from production clusters of 10–50+ nodes processing millions of metrics daily with 99.9% reliability.
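With multiple NiFi collectors running for high availability, the same cluster-wide sample can arrive more than once. One simple deduplication strategy is to key each record on node, metric, and a coarse time bucket — a sketch of the idea only, not the ClusterStatsProcessor's actual internals:

```python
def dedupe_metrics(records: list[dict], bucket_seconds: int = 10) -> list[dict]:
    """Keep the first record per (node, metric, time bucket); drop later duplicates."""
    seen, unique = set(), []
    for rec in records:
        key = (rec["node_id"], rec["metric"], int(rec["epoch"] // bucket_seconds))
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique
```

The bucket width trades deduplication aggressiveness against the risk of dropping genuinely distinct samples that land close together.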

Ideal for OpenSearch operators managing 10+ nodes, platform engineers building observability pipelines, and anyone hitting centralized monitoring limits. Walk away with a distributed architecture you can implement immediately.

PutOpenSearchVector and QueryOpenSearchVector: a use case for two Python NiFi processors for OpenSearch

NiFi 2.x introduced the ability to interface with OpenSearch through pre-built Python components (processors).

These processors address various needs related to AI solutions, specifically vector embedding with tools such as ChatGPT or HuggingFace. The selection and implementation of these components were partly driven by the recent surge of interest in these topics.

In this presentation, we will showcase an ETL flow that populates an OpenSearch index with vector fields. The flow is built entirely in NiFi, covering data acquisition, vector embedding generation, and querying via the dedicated OpenSearch components.
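Concretely, the populate side targets a k-NN-enabled OpenSearch index. As a hedged sketch — the field names `text` and `embedding` are illustrative choices, while the mapping keys follow standard OpenSearch k-NN syntax — the index body and document shape could look like:

```python
def knn_index_body(dimension: int) -> dict:
    """Settings and mappings for an OpenSearch index with a k-NN vector field."""
    return {
        "settings": {"index": {"knn": True}},
        "mappings": {"properties": {
            "text": {"type": "text"},
            "embedding": {"type": "knn_vector", "dimension": dimension},
        }},
    }

def vector_doc(text: str, embedding: list[float]) -> dict:
    """One document of the shape a PutOpenSearchVector-style step might ingest."""
    return {"text": text, "embedding": embedding}
```

The dimension must match the embedding model's output size, so it is typically fixed once per index at creation time.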

A growing trend is the demand for AI systems that run on local engines, independent of external services.

As an example of a possible application, we will present a solution based on customizing the PutOpenSearchVector and QueryOpenSearchVector processors to perform vector embedding and search using local LLMs served by Ollama.
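However the query vector is produced — here, by a local model served through Ollama — the search side reduces to an OpenSearch k-NN query built from that vector. A minimal sketch, assuming the illustrative `embedding` field name from before:

```python
def knn_query(vector: list[float], k: int = 5, field: str = "embedding") -> dict:
    """Build an OpenSearch k-NN query body returning the top-k nearest neighbours."""
    return {"size": k, "query": {"knn": {field: {"vector": vector, "k": k}}}}
```

A customized QueryOpenSearchVector-style processor would embed the user's query text locally and then submit a body like this to the index's `_search` endpoint.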
