Infrastructure enhancement
Security
The current infrastructure configuration is not suitable for a production environment. Here is what can be done to make it more secure:
Configure TLS between Logstash and Kafka.
Configure TLS between Kafka and Flink.
Configure HTTPS for Elasticsearch.
Configure Authentication and Role-Based access for Elasticsearch.
Configure Authentication and Role-Based access for Kibana.
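As an illustration of the Kafka side of this list, the sketch below builds the client properties that enable TLS. The keys `security.protocol`, `ssl.truststore.location` and `ssl.truststore.password` are standard Kafka client settings; the class name, method name, and the truststore path and password are hypothetical and would need to match the actual deployment.

```java
import java.util.Properties;

// Hypothetical helper: builds Kafka client properties for a TLS-only connection.
final class KafkaTlsConfig {

    // truststorePath and truststorePassword are placeholders supplied by the caller.
    static Properties tlsProperties(String truststorePath, String truststorePassword) {
        Properties props = new Properties();
        // Encrypt traffic between the client and the brokers.
        props.setProperty("security.protocol", "SSL");
        // Truststore holding the CA certificate that signed the broker certificates.
        props.setProperty("ssl.truststore.location", truststorePath);
        props.setProperty("ssl.truststore.password", truststorePassword);
        return props;
    }
}
```

Properties built this way could then be passed to whichever Kafka consumer or connector the Flink job uses, assuming the brokers expose a TLS listener.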
Reliability and performance
Configure Logstash pipeline.
Configure multiple Kafka Brokers and multiple Nodes.
Configure Flink cluster.
Configure Elasticsearch cluster.
DataStream enhancement
At the moment, events that are filtered out are not sent to Elasticsearch through the sink. The DataStream needs to be split: events matching the filters are sent to the asynchronous processing stream, and events not matching the filters are sent directly to the Elasticsearch sink.
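The routing decision described above can be sketched in plain Java, independent of Flink. In a Flink job, this kind of split is typically expressed with side outputs (a `ProcessFunction` emitting to an `OutputTag`); the class and field names below are hypothetical and only illustrate the logic.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical router: sends each event to exactly one of two destinations.
final class EventRouter<T> {
    private final Predicate<T> filter;
    // Events matching the filter, destined for asynchronous enrichment.
    final List<T> toEnrichment = new ArrayList<>();
    // Events not matching the filter, destined directly for the Elasticsearch sink.
    final List<T> toSink = new ArrayList<>();

    EventRouter(Predicate<T> filter) {
        this.filter = filter;
    }

    void route(T event) {
        if (filter.test(event)) {
            toEnrichment.add(event);
        } else {
            toSink.add(event);
        }
    }
}
```

With this shape, no event is dropped: every event ends up in Elasticsearch, either directly or after enrichment.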
Asynchronous requests enhancement
The Shodan Java client does not use asynchronous HTTP requests, so when the number of API calls exceeds 50 requests per second, calls can time out.
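One possible mitigation, sketched below, is to run the blocking client call on a small dedicated thread pool and expose it as a `CompletableFuture`. The wrapper class is hypothetical, and the `Supplier` stands in for the actual Shodan client call. Note that this bounds the number of concurrent calls, not the request rate itself.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Supplier;

// Hypothetical wrapper: turns a blocking call into an asynchronous one
// while capping how many calls run at the same time.
final class BoundedAsyncCaller implements AutoCloseable {
    private final ExecutorService pool;

    BoundedAsyncCaller(int maxInFlight) {
        // A fixed pool queues extra work instead of starting more calls.
        this.pool = Executors.newFixedThreadPool(maxInFlight);
    }

    <T> CompletableFuture<T> call(Supplier<T> blockingCall) {
        return CompletableFuture.supplyAsync(blockingCall, pool);
    }

    @Override
    public void close() {
        pool.shutdown();
    }
}
```

The returned future can be completed into whatever asynchronous callback the processing stream expects, so the calling thread is never blocked on the HTTP request.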
For each external source, responses need to be handled more robustly. The program must handle the case where the number of credits is insufficient and no results are returned.
It must also handle the case where too many requests are made per second and connections are rejected.
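A minimal sketch of how these two cases might be handled differently: a rate-limit rejection is retried with exponential backoff, while an insufficient-credits response is not retried, because waiting would not help. Both exception types and the `ResilientLookup` class are hypothetical; a real client would map its own error responses onto them.

```java
import java.util.Optional;
import java.util.function.Supplier;

// Hypothetical signal: the source rejected the call because of too many requests.
final class RateLimitedException extends RuntimeException {
    RateLimitedException(String message) { super(message); }
}

// Hypothetical signal: the account has no credits left, so no results are returned.
final class InsufficientCreditsException extends RuntimeException {
    InsufficientCreditsException(String message) { super(message); }
}

final class ResilientLookup {
    // Returns the result, or Optional.empty() when no result could be obtained.
    static <T> Optional<T> fetch(Supplier<T> request, int maxAttempts, long initialBackoffMillis) {
        long backoff = initialBackoffMillis;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return Optional.ofNullable(request.get());
            } catch (InsufficientCreditsException e) {
                // Retrying cannot succeed until credits are replenished.
                return Optional.empty();
            } catch (RateLimitedException e) {
                if (attempt == maxAttempts) {
                    break;
                }
                try {
                    Thread.sleep(backoff);
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    return Optional.empty();
                }
                backoff *= 2; // wait twice as long before the next attempt
            }
        }
        return Optional.empty();
    }
}
```

Returning an empty `Optional` lets the caller forward the event to Elasticsearch without enrichment instead of failing the whole stream; whether that is the desired behavior is a design choice for the project.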