OpenSearch is one of the main components of the Logs Data Platform and one of the most powerful search and analytics engines available. From the outset, we offered the possibility to host an index for your OpenSearch Dashboards metadata; Index As A Service is the next step of this functionality. You can now use a fully unlocked index for almost any purpose, be it complex documents, reports, or even logs. Thanks to the OpenSearch API, you can use most of the tools of the OpenSearch ecosystem.
Requirements
- an activated Logs Data Platform account
- access to port 9200 of your cluster (you can find the address of your cluster in the Logs Data Platform area of the OVHcloud Control Panel)
Instructions
First steps with an OpenSearch index
Create an index
There are two ways to create an OpenSearch Index:
- Use the Logs Data Platform manager.
- Use the OpenSearch API.
To create an OpenSearch index with the Logs Data Platform manager, go to the Index tab and click + Add an index in the OpenSearch index section.
You must choose a suffix for your index. The final name will follow this convention: logs-<username>-i-<suffix>.
For each index, you can specify the number of shards. A shard is the main component of an index. Its maximum storage capacity is set to 25 GB (per shard). Having multiple shards means more volume, more parallelism in your requests, and thus more performance. Optionally, you can also be notified when your index is close to its critical size. Once your index is created, you can use it right away.
Once you have made your selections, click Save.
When you create an index through the OpenSearch API, you can specify the number of shards. Note that the maximum number of shards per index is limited to 16. OpenSearch-compatible tools can now create indices on the cluster as long as they follow the naming convention logs-<username>-i-<suffix>. Here is an example of a curl command for the user logs-ab-12345 and the index logs-ab-12345-i-another-index on the VIN cluster.
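A sketch of such a call (the cluster address is shown as the placeholder <your_cluster>.logs.ovh.com; replace it with the address of your cluster, and mypassword with your password or token):

```
curl -u logs-ab-12345:mypassword -H 'Content-Type: application/json' \
  -XPUT 'https://<your_cluster>.logs.ovh.com:9200/logs-ab-12345-i-another-index' \
  -d '{"settings": {"number_of_shards": 1}}'
```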
You will find more information about API support in the dedicated section below.
Whatever method you use, you will be able to query and visualize your documents on Logs Data Platform using the API.
Index some data
Logs Data Platform OpenSearch indices are compatible with the OpenSearch REST API. Therefore, you can use simple HTTP requests to index and search your data. The API is accessible behind a secured HTTPS endpoint with mandatory authentication. We recommend that you use tokens to authenticate yourself. You can retrieve the endpoint of the API on the Home page of your service. Here is a simple example of indexing a document with curl, using an index on the cluster <your_cluster>.logs.ovh.com.
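A possible sketch, with purely illustrative fields and credentials:

```
curl -u logs-ab-12345:mypassword -H 'Content-Type: application/json' \
  -XPUT 'https://<your_cluster>.logs.ovh.com:9200/logs-ab-12345-i-another-index/_doc/1' \
  -d '{"user": "oles", "post_date": "2023-11-15T14:12:12", "message": "Trying out the Index as a Service"}'
```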
Here is a quick explanation of this command:
- The PUT HTTP command can be used to create or modify a document.
- The Content-Type: application/json header is mandatory to indicate that the data will be in JSON format.
- The address contains the endpoint of the cluster followed by the name of your index.
- The _doc just after the index name is the document endpoint (OpenSearch no longer uses custom document types).
- The 1 is the id of your document; it can be any string.
- The payload of the request is a simple JSON document that will be indexed.
This command will return a simple payload indicating whether the document has been indexed by all the shards involved.
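For reference, the response has roughly the following shape (values will differ):

```
{
  "_index": "logs-ab-12345-i-another-index",
  "_id": "1",
  "_version": 1,
  "result": "created",
  "_shards": { "total": 2, "successful": 2, "failed": 0 },
  "_seq_no": 0,
  "_primary_term": 1
}
```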
Search your data
There are multiple ways to search your data; this is one area where the OpenSearch REST API excels. You can either get your data directly by using a GET request, or search it with the Search APIs. To get the document indexed previously, use the following curl request:
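For example, assuming the illustrative document indexed above:

```
curl -u logs-ab-12345:mypassword \
  -XGET 'https://<your_cluster>.logs.ovh.com:9200/logs-ab-12345-i-another-index/_doc/1'
```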
To issue a simple search, you can use either the Query DSL or a URI search. Here is a simple example with a URI search:
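A possible URI search against the same index (the user field comes from the illustrative document above):

```
curl -u logs-ab-12345:mypassword \
  -XGET 'https://<your_cluster>.logs.ovh.com:9200/logs-ab-12345-i-another-index/_search?q=user:oles&pretty'
```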
Use case: Enrich Logs Data on the fly
The following shows how your e-commerce application logs can be sent to the Logs Data Platform whenever a product is ordered. The application does not need to fetch the full name of the client or other information from the customer database just to produce a log: you can add this information on the fly by using an OpenSearch index and a Logstash collector on the Logs Data Platform.
Fill an index with client information
The first thing to do is to index some client information. The snippet below is one entry of the client index.
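A possible entry, assuming a client index named logs-ab-12345-i-clients and illustrative fields (userId, firstName, lastName and a nested address object), which are reused in the rest of this example:

```
{
  "userId": "123456",
  "firstName": "Marie",
  "lastName": "Dupont",
  "address": {
    "street": "2 rue Kellermann",
    "city": "Roubaix",
    "latitude": 50.6917,
    "longitude": 3.1746
  }
}
```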
To index several documents at once, it is more efficient to use the Bulk API. Here is a small snippet of three users you can use to test it.
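A sample bulk payload with three illustrative users (the file must end with a newline):

```
{ "index": { "_index": "logs-ab-12345-i-clients", "_id": "1" } }
{ "userId": "123456", "firstName": "Marie", "lastName": "Dupont", "address": { "street": "2 rue Kellermann", "city": "Roubaix", "latitude": 50.6917, "longitude": 3.1746 } }
{ "index": { "_index": "logs-ab-12345-i-clients", "_id": "2" } }
{ "userId": "123457", "firstName": "Paul", "lastName": "Martin", "address": { "street": "10 place de la Gare", "city": "Lille", "latitude": 50.6365, "longitude": 3.0635 } }
{ "index": { "_index": "logs-ab-12345-i-clients", "_id": "3" } }
{ "userId": "123458", "firstName": "Alice", "lastName": "Durand", "address": { "street": "5 avenue de la Liberté", "city": "Paris", "latitude": 48.8698, "longitude": 2.3075 } }
```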
A bulk request is a succession of JSON objects with this structure:
action_and_meta_data\n
optional_source\n
action_and_meta_data\n
optional_source\n
...
action_and_meta_data\n
optional_source\n
In one request, you can ask OpenSearch to index, update, or delete several documents. Save the content of the previous JSON lines in a file named bulk and use the following call to index these three users:
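Assuming the bulk file from above sits in the current directory:

```
curl -u logs-ab-12345:mypassword -H 'Content-Type: application/json' \
  -XPOST 'https://<your_cluster>.logs.ovh.com:9200/_bulk' \
  --data-binary "@bulk"
```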
This call will take the content of the bulk file and execute each index operation. Note that you have to use the --data-binary option and not -d in order to preserve the newline after each JSON object. You can check that your data is properly indexed with the following call:
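For instance, a simple search over the whole client index:

```
curl -u logs-ab-12345:mypassword \
  -XGET 'https://<your_cluster>.logs.ovh.com:9200/logs-ab-12345-i-clients/_search?pretty'
```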
This will give you back the documents of your index:
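With the illustrative data above, the response would have roughly this shape (only the first hit is shown here):

```
{
  "took": 2,
  "timed_out": false,
  "_shards": { "total": 1, "successful": 1, "skipped": 0, "failed": 0 },
  "hits": {
    "total": { "value": 3, "relation": "eq" },
    "max_score": 1.0,
    "hits": [
      {
        "_index": "logs-ab-12345-i-clients",
        "_id": "1",
        "_score": 1.0,
        "_source": {
          "userId": "123456",
          "firstName": "Marie",
          "lastName": "Dupont",
          "address": { "street": "2 rue Kellermann", "city": "Roubaix", "latitude": 50.6917, "longitude": 3.1746 }
        }
      }
    ]
  }
}
```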
Now that you have some data, you can enrich your logs with it. For this, we will use a Logstash collector and the Logstash elasticsearch filter plugin (some Elasticsearch tools are compatible with OpenSearch).
Configure a Logstash collector
If you don't know how to create a Logstash collector, please refer to the Logstash guide. Edit the configuration of Logstash. For this example, we will use an SSL TCP input with the GELF codec. Here is the input configuration.
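A minimal sketch of such an input (the port and certificate paths are illustrative; use the values provided for your collector):

```
input {
  tcp {
    port => 5044
    ssl_enable => true
    ssl_verify => false
    ssl_cert => "/etc/ssl/private/server.crt"
    ssl_key => "/etc/ssl/private/server.key"
    codec => gelf
  }
}
```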
The most important part in this configuration is the filter part:
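Here is a sketch of what this filter could look like, assuming the client index logs-ab-12345-i-clients and the illustrative credentials used earlier (option names may vary slightly with your plugin version):

```
filter {
  elasticsearch {
    hosts => ["https://<your_cluster>.logs.ovh.com:9200"]
    index => "logs-ab-12345-i-clients"
    user => "logs-ab-12345"
    password => "mypassword"
    enable_sort => false
    query => "userId:%{[userId]}"
    fields => {
      "firstName" => "firstName"
      "lastName"  => "lastName"
      "address"   => "address"
    }
  }
  mutate {
    add_field => { "geolocation" => "%{[address][latitude]},%{[address][longitude]}" }
    remove_field => [ "address" ]
  }
}
```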
The filter part is composed of two plugins: the elasticsearch plugin and the mutate plugin. The elasticsearch plugin has the following configuration:
- hosts: This is the address of the OpenSearch API of your LDP cluster. Note that we use https here.
- index: This is the name of the index containing your static data.
- user: This is the username used to authenticate against the API. Again, we recommend that you use tokens for that.
- password: The password of the user.
- enable_sort: this setting indicates that there is no need to sort the results for this request.
- query: This is the query issued. Here the query is a simple string query searching for the document whose userId field is set to the value found in the log event. %{[userId]} will be replaced by the value contained in the userId field of the log event.
- fields: This is where the magic happens. The fields of the matching document will be added to the event. The field of the document is on the left and the new (or updated) field of the event is on the right. Be sure to follow the field naming conventions.
The mutate plugin is here to show you how you can combine different subfield information into one top-level field. Here we combine a latitude and a longitude field to create a geolocation field, then we remove the original address top-level field.
Send and retrieve your logs
One simple way to test your new Logstash configuration is to send a log by using echo and openssl.
Check the examples below:
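For example, assuming an illustrative collector address and port, and a GELF payload carrying the userId of one of the clients indexed earlier (GELF additional fields are prefixed with an underscore on the wire):

```
echo -e '{"version":"1.1","host":"shop.example.com","short_message":"Product E101 ordered","_userId":"123456"}\0' \
  | openssl s_client -quiet -no_ign_eof -connect <your_collector>.logs.ovh.com:5044
```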
As you can see, we just specify the userId this order belongs to. Sending this log to your Logstash input will give you the following final log:
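With the sample client data indexed earlier, the enriched event would look roughly like this (exact field names depend on your codec and collector settings):

```
{
  "host": "shop.example.com",
  "message": "Product E101 ordered",
  "userId": "123456",
  "firstName": "Marie",
  "lastName": "Dupont",
  "geolocation": "50.6917,3.1746"
}
```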
The log has been automatically enriched with the fields we declared in our filter. Linking information from an index with the logs allows you to create more meaningful dashboards based on this information:
In this dashboard, you can see that the first widget is a "quick values" widget based on the firstName field of the logs we retrieved.
Monitor the Index Size
The maximum size of your index is fixed and depends on the number of shards. Shards are the unit of parallelism in OpenSearch, so if search performance is critical, you should choose an index with the highest number of shards you can afford. Thanks to the high-performance nodes we use, we managed to send thousands of logs to Logstash and enrich all of them within seconds using only one shard.
Note that you can monitor the size of the index yourself by using the following curl query:
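A possible call using the stats API of your index:

```
curl -u logs-ab-12345:mypassword \
  -XGET 'https://<your_cluster>.logs.ovh.com:9200/logs-ab-12345-i-another-index/_stats/store?pretty'
```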
This command will give you a document with the following format:
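An abridged sketch of the response (values are illustrative):

```
{
  "_shards": { "total": 2, "successful": 2, "failed": 0 },
  "_all": {
    "primaries": { "store": { "size_in_bytes": 12345678 } },
    "total": { "store": { "size_in_bytes": 24691356 } }
  },
  "indices": {
    "logs-ab-12345-i-another-index": {
      "primaries": { "store": { "size_in_bytes": 12345678 } },
      "total": { "store": { "size_in_bytes": 24691356 } }
    }
  }
}
```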
The size in bytes used to compute your billing is the one under the following path: "indices" -> "logs-<username>-i-<suffix>" -> "primaries" -> "store" -> "size_in_bytes".
Management through OpenSearch API
On Logs Data Platform, we allow users to use the OpenSearch API to handle the lifecycle of their indices. You can create and delete indices directly with the OpenSearch API. You can also create aliases and delete them. We even support templates to allow users to create their mappings automatically at the creation of the index!
Index creation and deletion
To create an index on Logs Data Platform, use the following call:
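As a sketch (the index suffix and the number of shards are only examples):

```
curl -u logs-ab-12345:mypassword -H 'Content-Type: application/json' \
  -XPUT 'https://<your_cluster>.logs.ovh.com:9200/logs-ab-12345-i-my-new-index' \
  -d '{"settings": {"number_of_shards": 2}}'
```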
- The -u option is followed by your LDP username, which you can find on the Home page. The password 'mypassword' follows it after the ':' separator.
- The PUT HTTP command can be used to create or modify a document.
- The -H 'Content-Type: application/json' option sets the mandatory header indicating that the data will be in JSON format.
- The address contains the endpoint of the cluster followed by the name of your index.
- The payload of the request is a JSON document which contains the settings of your index: the number of shards (the number of replicas will be automatically set to 1).
You have to follow the Logs Data Platform naming convention <username>-i-<your-suffix> to create your index. Your username is the one you use to connect to Graylog or to use the API. The suffix can contain any alphanumeric character.
To delete an index, use the following call:
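For example, to delete the index created above:

```
curl -u logs-ab-12345:mypassword \
  -XDELETE 'https://<your_cluster>.logs.ovh.com:9200/logs-ab-12345-i-my-new-index'
```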
Here we use the DELETE HTTP command to delete the index.
Alias creation and deletion
Like indices, you can use API calls to create and delete aliases on your indices. The only difference is the naming convention for your alias: it must be formatted as <username>-a-<suffix>. Here is an example call:
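For example, adding an alias to the index created above (both names are illustrative but follow the conventions):

```
curl -u logs-ab-12345:mypassword \
  -XPUT 'https://<your_cluster>.logs.ovh.com:9200/logs-ab-12345-i-my-new-index/_alias/logs-ab-12345-a-my-alias'
```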
This call creates an individual alias on one index you have previously created.
If you need more information about aliases, you can check the OpenSearch documentation.
We also support the aliases API:
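A sketch of such a call, using illustrative index and alias names that follow the naming conventions:

```
curl -u logs-ab-12345:mypassword -H 'Content-Type: application/json' \
  -XPOST 'https://<your_cluster>.logs.ovh.com:9200/_aliases' \
  -d '{
    "actions": [
      { "remove": { "index": "logs-ab-12345-i-index-a", "alias": "logs-ab-12345-a-alias-a" } },
      { "add": { "index": "logs-ab-12345-i-index-b", "alias": "logs-ab-12345-a-alias-a" } },
      { "remove_index": { "index": "logs-ab-12345-i-index-a" } }
    ]
  }'
```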
The actions in the call above will do the following:
- remove Alias-A from Index-A
- add Alias-A to Index-B
- remove Index-A
All the actions (alias change, alias creation and index deletion) will be done in a single call. All the indices and aliases involved must follow the convention, otherwise an error will be thrown.
Templates
Logs Data Platform supports your custom templates. As for indices and aliases, templates must follow some rules in order to work:
- The template name must contain your <username>. It can be anywhere in the name string.
- The index patterns of the template MUST start with the prefix <username>-i-; the "*" character must come after this prefix.
- The aliases attached to your template must follow the usual convention: <username>-a-<suffix>
Here is an example of a template for the user logs-ab-12345:
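A sketch of such a template, using the legacy template endpoint with illustrative names and mappings (the orders suffix and the fields are only examples):

```
curl -u logs-ab-12345:mypassword -H 'Content-Type: application/json' \
  -XPUT 'https://<your_cluster>.logs.ovh.com:9200/_template/template-logs-ab-12345-orders' \
  -d '{
    "index_patterns": ["logs-ab-12345-i-orders-*"],
    "settings": { "number_of_shards": 1 },
    "mappings": {
      "properties": {
        "orderId":   { "type": "keyword" },
        "orderDate": { "type": "date" }
      }
    },
    "aliases": { "logs-ab-12345-a-orders": {} }
  }'
```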
This template will be applied to every new index matching the index pattern.
Manager
All the items you create through the OpenSearch API will be displayed in your manager and can be deleted or monitored through it.
Here, the first index was created through the API; its description was filled automatically.
Additional Information
Index as a Service has some specificities on our platforms. The following additional technical information can help you use it properly:
- Replication is set to 1 and cannot be changed. We ensure the high availability of your index in case of a hardware failure.
- The maximum size of your index is fixed and depends on the number of shards. If search performance is critical, you should choose an index with the highest number of shards you can afford.
- The index_refresh_interval of the index is set to 1 second, ensuring near real-time search results.
- You are not allowed to change the settings of your index.
- You can create an alias on Logs Data Platform and attach it to one or several indices.
- Unlike indices, aliases are read-only; you cannot write through an alias yet.
Go further
For more information and tutorials, please see our other Logs Data Platform support guides or explore the guides for other OVHcloud products and services.