# POC Prerequisites

{% hint style="info" %}
Note: The infrastructure for this demo is deliberately small and simple, with no security or reliability features configured. Do not use these configurations in production.
{% endhint %}

{% hint style="info" %}
This demo has been tested on CentOS 8 (8.0-1905, the latest version at the time of writing).
{% endhint %}

## Prerequisites

### System

{% tabs %}
{% tab title="CentOS7/8" %}

```bash
yum -y install epel-release
yum update
yum upgrade

# Edit the SELinux configuration file
vi /etc/selinux/config
# Change the SELINUX value to
SELINUX=disabled

systemctl stop firewalld
systemctl disable firewalld

reboot
```

{% endtab %}

{% tab title="Others" %}

{% endtab %}
{% endtabs %}
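The interactive SELinux edit above can also be scripted. The following is a minimal sketch, assuming GNU `sed` (as shipped with CentOS); the function name `set_selinux_mode` is hypothetical and not part of any package.

```bash
# Hypothetical helper: set the SELINUX= line of a SELinux config file.
# Usage: set_selinux_mode <config-file> <mode>
set_selinux_mode() {
    # Only the line starting with "SELINUX=" is rewritten; "SELINUXTYPE=" is
    # left alone because the pattern requires "=" right after "SELINUX".
    sed -i "s/^SELINUX=.*/SELINUX=$2/" "$1"
}

# Example (as root): set_selinux_mode /etc/selinux/config disabled
```

As with the manual edit, the new mode only takes effect after the reboot.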

### Elastic repo

Download and install the Elastic public signing key:

{% tabs %}
{% tab title="CentOS7/8" %}

```
rpm --import https://artifacts.elastic.co/GPG-KEY-elasticsearch
```

{% endtab %}

{% tab title="Others" %}

{% endtab %}
{% endtabs %}

Create a new yum repo file and add the following lines:

{% tabs %}
{% tab title="CentOS7/8" %}

```
vi /etc/yum.repos.d/elasticsearch.repo

[elasticsearch-7.x]
name=Elasticsearch repository for 7.x packages
baseurl=https://artifacts.elastic.co/packages/7.x/yum
gpgcheck=1
gpgkey=https://artifacts.elastic.co/GPG-KEY-elasticsearch
enabled=1
autorefresh=1
type=rpm-md
```

{% endtab %}
{% endtabs %}
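If you prefer not to open an editor, the same repo file can be written from a script. This is a sketch only: the function name `write_elastic_repo` is hypothetical, and the file content is copied from the block above.

```bash
# Hypothetical helper: write the Elastic 7.x yum repo definition to a file.
# Usage: write_elastic_repo <destination-file>
write_elastic_repo() {
    cat > "$1" <<'EOF'
[elasticsearch-7.x]
name=Elasticsearch repository for 7.x packages
baseurl=https://artifacts.elastic.co/packages/7.x/yum
gpgcheck=1
gpgkey=https://artifacts.elastic.co/GPG-KEY-elasticsearch
enabled=1
autorefresh=1
type=rpm-md
EOF
}

# Example (as root): write_elastic_repo /etc/yum.repos.d/elasticsearch.repo
```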

## Java

### Installation

{% tabs %}
{% tab title="CentOS7/8" %}

```
sudo yum install java-1.8.0-openjdk
```

{% endtab %}
{% endtabs %}

## Elasticsearch

### Installation

{% tabs %}
{% tab title="CentOS7/8" %}

```
yum install elasticsearch
```

{% endtab %}
{% endtabs %}

### Configuration

Modify or add the following lines in the Elasticsearch configuration file:

```
sudo vi /etc/elasticsearch/elasticsearch.yml


cluster.name: BOTES
node.name: Glooper
# If you have to change default data path 
# Don't forget to change permissions for data folder
# chown -R elasticsearch:elasticsearch /opt/data/elasticsearch/
path.data: /opt/data/elasticsearch
path.logs: /var/log/elasticsearch
bootstrap.memory_lock: true
network.host: localhost
http.port: 9200
cluster.initial_master_nodes: ["$Your_Server_IP"]
```

Configure the JVM heap size and the system memory-lock limits for Elasticsearch:

```
sudo vi /etc/elasticsearch/jvm.options


-Xms2g
# Xmx must be set to no more than 50% of total memory and no more than 32 GB
-Xmx2g
```
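The heap-size rule in the comment above can be turned into a small calculation. The sketch below is illustrative: `suggest_heap_mb` is a hypothetical name, and the 31 GB cap is an assumption chosen to stay under the 32 GB limit stated above.

```bash
# Hypothetical helper: suggest a JVM heap size in MB for a given amount of RAM in MB.
# Rule: at most 50% of total memory, and never more than the cap.
suggest_heap_mb() {
    half=$(( $1 / 2 ))
    cap=$(( 31 * 1024 ))   # assumption: 31 GB keeps the heap under 32 GB
    if [ "$half" -gt "$cap" ]; then
        echo "$cap"
    else
        echo "$half"
    fi
}

# Example: read total RAM from /proc/meminfo (in kB) and print matching JVM flags.
# total_mb=$(awk '/^MemTotal:/ {print int($2 / 1024)}' /proc/meminfo)
# heap=$(suggest_heap_mb "$total_mb"); echo "-Xms${heap}m"; echo "-Xmx${heap}m"
```

For a 4 GB machine this gives 2048 MB, which matches the `-Xms2g` and `-Xmx2g` values used above.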

{% tabs %}
{% tab title="CentOS7/8" %}

```
sudo vi /etc/security/limits.conf

# Add the following lines
elasticsearch   soft  memlock   unlimited
elasticsearch   hard  memlock   unlimited


sudo vi /etc/sysconfig/elasticsearch

# Modify the following lines
MAX_OPEN_FILES=65535
MAX_LOCKED_MEMORY=unlimited


sudo vi /usr/lib/systemd/system/elasticsearch.service

# Add the following line to the [Service] section
LimitMEMLOCK=infinity
```

{% endtab %}

{% tab title="Others" %}

```
```

{% endtab %}
{% endtabs %}
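Files under `/usr/lib/systemd/system` can be overwritten by package updates, so an alternative to editing the unit directly is a systemd drop-in file. The following is a sketch under that assumption; the function name `write_memlock_dropin` and the file name `override.conf` are illustrative choices.

```bash
# Hypothetical helper: write a systemd drop-in that raises the memlock limit.
# Usage: write_memlock_dropin <drop-in-directory>
write_memlock_dropin() {
    mkdir -p "$1"
    cat > "$1/override.conf" <<'EOF'
[Service]
LimitMEMLOCK=infinity
EOF
}

# Example (as root):
#   write_memlock_dropin /etc/systemd/system/elasticsearch.service.d
#   systemctl daemon-reload
```

Whichever approach is used, `systemctl daemon-reload` is needed before systemd picks up the changed unit configuration.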

Then start and enable Elasticsearch:

```
systemctl start elasticsearch
systemctl enable elasticsearch
```
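Once the service is up, it can be useful to confirm that `bootstrap.memory_lock: true` actually took effect. The sketch below assumes `curl` is installed and that Elasticsearch listens on `localhost:9200` as configured above; `mlockall_enabled` is a hypothetical helper that only inspects the JSON text it receives.

```bash
# Hypothetical helper: read a node-info JSON response on stdin and succeed
# only if it reports "mlockall": true.
mlockall_enabled() {
    grep -q '"mlockall"[[:space:]]*:[[:space:]]*true'
}

# Example, once Elasticsearch is running:
# curl -s 'http://localhost:9200/_nodes?filter_path=**.mlockall' | mlockall_enabled \
#     && echo "memory is locked" || echo "memory is NOT locked"
```

If the check reports that memory is not locked, the memlock limits from the previous step are the first thing to review.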

## Logstash

### Installation

{% tabs %}
{% tab title="CentOS7/8" %}

```
yum install logstash
```

{% endtab %}
{% endtabs %}

### Configuration

The Logstash output configuration for Kafka can be downloaded here: <https://botes.s3-us-west-1.amazonaws.com/botes-logstash-configuration/full-config/output-kafka/output-kafka.conf>

The Logstash input configurations for the BOTES JSON files are available in the Logstash Configuration section of [BOTES Prerequisites](/botes-dataset/botes-prerequisites.md#logstash-configuration).
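The downloaded files still need to be placed where Logstash looks for pipelines. A minimal sketch, assuming the default RPM layout in which pipeline files are read from `/etc/logstash/conf.d`; the function name `install_pipeline` is hypothetical.

```bash
# Hypothetical helper: copy a pipeline file into a Logstash config directory
# with world-readable permissions.
# Usage: install_pipeline <pipeline-file> <conf-dir>
install_pipeline() {
    install -m 0644 "$1" "$2/"
}

# Example (as root), after downloading output-kafka.conf:
#   install_pipeline output-kafka.conf /etc/logstash/conf.d
# Optionally check the configuration syntax before starting the service:
#   /usr/share/logstash/bin/logstash --path.settings /etc/logstash --config.test_and_exit
```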

Then start and enable Logstash:

```
systemctl start logstash
systemctl enable logstash
```

## Kibana

### Installation

{% tabs %}
{% tab title="CentOS7/8" %}

```
yum install kibana
```

{% endtab %}
{% endtabs %}

### Configuration

Modify or add the following lines in the Kibana configuration file:

```
sudo vi /etc/kibana/kibana.yml


server.port: 5601
server.host: "$Your_Server_IP"
server.name: "BOTES"
elasticsearch.hosts: ["http://localhost:9200"]
# Raise the timeout if the Elasticsearch node is slow (on a Raspberry Pi, for example)
elasticsearch.requestTimeout: 300000
```

Then start and enable Kibana:

```
systemctl start kibana
systemctl enable kibana
```

## Zookeeper

### Installation

{% tabs %}
{% tab title="CentOS7/8" %}

```
cd /opt
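# Use the "-bin" tarball: since 3.5.5 the archive without that suffix contains sources only
wget http://apache-mirror.8birdsvideo.com/zookeeper/stable/apache-zookeeper-3.5.5-bin.tar.gz
tar zxf apache-zookeeper-3.5.5-bin.tar.gz
rm -f apache-zookeeper-3.5.5-bin.tar.gz
ln -s /opt/apache-zookeeper-3.5.5-bin/ /opt/zookeeper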

mkdir /opt/zookeeper/logs
mkdir /opt/zookeeper/data

sudo useradd zk -m
sudo usermod --shell /bin/bash zk
# On CentOS the administrative group is "wheel" (there is no "sudo" group)
sudo usermod -aG wheel zk
sudo chown -R zk:zk /opt/zookeeper/logs/
sudo chown -R zk:zk /opt/zookeeper/data/
```

{% endtab %}
{% endtabs %}

### Configuration

Create the Zookeeper configuration file and add the following lines:

{% tabs %}
{% tab title="CentOS7/8" %}

```
vi /opt/zookeeper/conf/server.configuration


tickTime=2000
dataDir=/opt/zookeeper/data
dataLogDir=/opt/zookeeper/logs
clientPort=2181
clientPortAddress=localhost
```

{% endtab %}

{% tab title="Others" %}

{% endtab %}
{% endtabs %}
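A missing `=` in a `key=value` file is easy to overlook and tends to surface only when the service fails to start. The sketch below is a simple sanity check written for this page, not a Zookeeper tool; `check_properties` is a hypothetical name.

```bash
# Hypothetical helper: print (with line numbers) every line of a properties
# file that is neither blank, a "#" comment, nor of the form key=value.
# Returns non-zero if at least one such line is found.
check_properties() {
    if grep -nEv '^[[:space:]]*(#.*)?$|^[^=[:space:]]+=.*$' "$1"; then
        return 1
    fi
    return 0
}

# Example: check_properties /opt/zookeeper/conf/server.configuration
```

The same check can be applied to `/opt/kafka/config/server.properties`, which uses the same `key=value` layout.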

### Systemd script

Create the Zookeeper systemd unit file `/etc/systemd/system/zookeeper.service` with the following content:

{% tabs %}
{% tab title="CentOS7/8" %}

```
[Unit]
Description=Zookeeper Daemon
Documentation=http://zookeeper.apache.org
Requires=network.target
After=network.target


[Service]
Type=forking
WorkingDirectory=/opt/zookeeper
User=zk
Group=zk
ExecStart=/opt/zookeeper/bin/zkServer.sh start /opt/zookeeper/conf/server.configuration
ExecStop=/opt/zookeeper/bin/zkServer.sh stop /opt/zookeeper/conf/server.configuration
ExecReload=/opt/zookeeper/bin/zkServer.sh restart /opt/zookeeper/conf/server.configuration
TimeoutSec=30
Restart=on-failure


[Install]
WantedBy=default.target
```

{% endtab %}

{% tab title="Others" %}

```
sudo vi /etc/systemd/system/zookeeper.service


[Unit]
Description=Zookeeper Daemon
Documentation=http://zookeeper.apache.org
Requires=network.target
After=network.target


[Service]
Type=forking
WorkingDirectory=/opt/zookeeper
User=zk
Group=zk
ExecStart=/opt/zookeeper/bin/zkServer.sh start /opt/zookeeper/conf/server.configuration
ExecStop=/opt/zookeeper/bin/zkServer.sh stop /opt/zookeeper/conf/server.configuration
ExecReload=/opt/zookeeper/bin/zkServer.sh restart /opt/zookeeper/conf/server.configuration
TimeoutSec=30
Restart=on-failure


[Install]
WantedBy=default.target
```

{% endtab %}
{% endtabs %}

Then start and enable Zookeeper:

```
systemctl start zookeeper
systemctl enable zookeeper
```

## Kafka

### Installation

{% tabs %}
{% tab title="CentOS7/8" %}

```
cd /opt
wget http://apache-mirror.8birdsvideo.com/kafka/2.3.0/kafka_2.11-2.3.0.tgz
tar zxf kafka_2.11-2.3.0.tgz
rm -f kafka_2.11-2.3.0.tgz
ln -s /opt/kafka_2.11-2.3.0/ /opt/kafka
```

{% endtab %}

{% tab title="Others" %}

{% endtab %}
{% endtabs %}

### Configuration

Modify or add the following lines in the Kafka configuration file:

{% tabs %}
{% tab title="CentOS7/8" %}

```
vi /opt/kafka/config/server.properties


broker.id=0
listeners=PLAINTEXT://localhost:9092
log.retention.hours=24
zookeeper.connect=localhost:2181
zookeeper.connection.timeout.ms=6000
```

{% endtab %}

{% tab title="Others" %}

{% endtab %}
{% endtabs %}

### Systemd script

{% tabs %}
{% tab title="CentOS7/8" %}

```
sudo vi /etc/systemd/system/kafka.service


[Unit]
Description=Apache Kafka server (broker)
Documentation=http://kafka.apache.org/documentation.html
Requires=network.target remote-fs.target
After=network.target remote-fs.target zookeeper.service


[Service]
Type=simple
User=zk
Group=zk
Environment=JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk
ExecStart=/opt/kafka/bin/kafka-server-start.sh /opt/kafka/config/server.properties
ExecStop=/opt/kafka/bin/kafka-server-stop.sh


[Install]
WantedBy=multi-user.target
```

{% endtab %}

{% tab title="Others" %}

{% endtab %}
{% endtabs %}

Then start and enable Kafka:

```
systemctl start kafka
systemctl enable kafka
```
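A quick way to confirm that the broker is reachable is to create a throwaway topic. This is a sketch assuming the `/opt/kafka` symlink and the `localhost:9092` listener configured above; the function name `create_test_topic` and the topic name in the example are hypothetical.

```bash
# Hypothetical helper: create a single-partition topic on the local broker.
# KAFKA_HOME defaults to the /opt/kafka symlink created during installation.
# Usage: create_test_topic <topic-name>
create_test_topic() {
    "${KAFKA_HOME:-/opt/kafka}/bin/kafka-topics.sh" --create \
        --bootstrap-server localhost:9092 \
        --replication-factor 1 --partitions 1 \
        --topic "$1"
}

# Example: create_test_topic botes-smoke-test
# List topics afterwards:
#   /opt/kafka/bin/kafka-topics.sh --list --bootstrap-server localhost:9092
```

A replication factor of 1 is used because this setup runs a single broker.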

## Maven

### Installation

{% tabs %}
{% tab title="CentOS7/8" %}

```
cd /opt
wget http://apache.mirrors.ionfish.org/maven/maven-3/3.6.2/binaries/apache-maven-3.6.2-bin.tar.gz
tar xzf apache-maven-3.6.2-bin.tar.gz
rm -f apache-maven-3.6.2-bin.tar.gz
ln -s /opt/apache-maven-3.6.2/ /opt/maven
export M2_HOME=/opt/maven
export PATH=${M2_HOME}/bin:${PATH}
```

{% endtab %}
{% endtabs %}

## Flink

### Installation

{% tabs %}
{% tab title="CentOS7/8" %}

```
cd /opt
wget http://apache-mirror.8birdsvideo.com/flink/flink-1.9.0/flink-1.9.0-bin-scala_2.11.tgz
tar zxf flink-1.9.0-bin-scala_2.11.tgz
rm -f flink-1.9.0-bin-scala_2.11.tgz
ln -s /opt/flink-1.9.0/ /opt/flink
export FLINK_HOME=/opt/flink/
export PATH=$PATH:$FLINK_HOME/bin
```

{% endtab %}
{% endtabs %}
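The `export` lines in the Maven and Flink sections only affect the current shell session. One way to make them persistent is a script under `/etc/profile.d`, sketched below; the function name `write_poc_profile` and the file name `botes-poc.sh` are hypothetical.

```bash
# Hypothetical helper: write the Maven and Flink environment variables to a
# profile script so that new login shells pick them up.
# Usage: write_poc_profile <destination-file>
write_poc_profile() {
    cat > "$1" <<'EOF'
export M2_HOME=/opt/maven
export FLINK_HOME=/opt/flink
export PATH="${M2_HOME}/bin:${FLINK_HOME}/bin:${PATH}"
EOF
}

# Example (as root): write_poc_profile /etc/profile.d/botes-poc.sh
```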

## Redis

### Installation

```
yum install redis
```

### Configuration

Modify or add the following lines in the Redis configuration file:

```
cp /etc/redis.conf /etc/redis.conf.orig
sudo vi /etc/redis.conf


bind 127.0.0.1
port 6379
```

Modify the system parameters:

{% tabs %}
{% tab title="CentOS7/8" %}

```
sudo vi /etc/sysctl.conf

vm.overcommit_memory=1


# Then, as root, apply the change and disable transparent huge pages
sysctl -p
echo never > /sys/kernel/mm/transparent_hugepage/enabled
```

{% endtab %}

{% tab title="Others" %}

{% endtab %}
{% endtabs %}
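Editing `/etc/sysctl.conf` by hand can leave duplicate entries if the step is repeated. The sketch below sets a key idempotently, assuming GNU `sed`; the function name `set_sysctl_entry` is hypothetical.

```bash
# Hypothetical helper: set key=value in a sysctl-style file, replacing an
# existing entry for the key or appending one if it is missing.
# Usage: set_sysctl_entry <file> <key> <value>
set_sysctl_entry() {
    if grep -q "^[[:space:]]*$2[[:space:]]*=" "$1"; then
        sed -i "s|^[[:space:]]*$2[[:space:]]*=.*|$2=$3|" "$1"
    else
        echo "$2=$3" >> "$1"
    fi
}

# Example (as root):
#   set_sysctl_entry /etc/sysctl.conf vm.overcommit_memory 1
#   sysctl -p
```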

Then start and enable Redis:

```
systemctl start redis
systemctl enable redis
```
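With every service started, a simple port check gives a first indication that the stack is up. This is a sketch only: `port_open` is a hypothetical helper that relies on bash's `/dev/tcp` feature, and the port list is taken from the configurations on this page.

```bash
# Hypothetical helper: succeed if a TCP port accepts connections.
# Relies on bash's /dev/tcp pseudo-device, so it needs bash rather than plain sh.
# Usage: port_open <host> <port>
port_open() {
    (exec 3<>"/dev/tcp/$1/$2") 2>/dev/null
}

# Example: check Elasticsearch, Zookeeper, Kafka and Redis on localhost.
# for p in 9200 2181 9092 6379; do
#     port_open localhost "$p" && echo "port $p: open" || echo "port $p: closed"
# done
# Kibana is bound to the server address in kibana.yml, so check it there:
# port_open "<your server IP>" 5601 && echo "kibana: open"
```

An open port only shows that something is listening; it does not replace the service-specific checks.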


