abstract | excerpt | authors | completed_date | last_updated | components | draft | keywords | primary_tag | related_content | title | subtitle | tags | translators | type | ||||||||||||||||||||||||||||||||||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
This tutorial shows you how to get unstructured JSON data from an API, store it in CockroachDB hosted on Red Hat Marketplace, and query the unstructured JSON data from the table using Python and a Jupyter notebook. |
This tutorial shows you how to get unstructured JSON data from an API, store it in CockroachDB hosted on Red Hat Marketplace, and query the unstructured JSON data from the table using Python and a Jupyter notebook. |
|
2020-09-03 |
2020-09-03 |
|
true |
cockroachdb, red hat marketplace |
databases |
|
Store and query unstructured JSON data from CockroachDB hosted on Red Hat Marketplace |
Access JSON data with CockroachDB and OpenShift |
|
|
tutorial |
This tutorial shows you how to get unstructured JSON data from an API, store it in CockroachDB hosted on Red Hat Marketplace, and query the unstructured JSON data from the table using Python and Jupyter notebooks.
CockroachDB is an elastic SQL database that easily scales transactions for your apps and services. Its cloud-native architecture simplifies scale and also guarantees consistent transactions across multiple regions and multiple clouds. Learn more.
When you have completed this tutorial, you will understand how to:
- Install CockroachDB Operator from Red Hat Marketplace on an OpenShift Cluster
- Create a CockroachDB cluster instance
- Create a user and database in CockroachDB
- Store and query unstructured JSON data from a third-party API in CockroachDB
- Manage the CockroachDB cluster from the Cluster Overview page
Completing this tutorial should take about 30 minutes.
- Red Hat Marketplace account.
- Red Hat OpenShift Cluster.
- OpenShift container & kubectl CLI.
- Access to a Jupyter Notebook. You can install a Jupyter Notebook from python-pip or use a tool such as Anaconda to open the Jupyter Notebook.
Follow the steps in this tutorial to deploy a CockroachDB Operator from Red Hat Marketplace on an OpenShift Cluster:
Once you have successfully set up a CockroachDB Operator on an OpenShift Cluster you can create a database.
Now, let's create a user
and a database
.
-
Run the following command to spin up a CockroachDB client:
$ kubectl run -it --rm cockroach-client \ --image=cockroachdb/cockroach \ --restart=Never \ --command -- \ ./cockroach sql --insecure --host=example-cockroachdb-public.cockroachdb-test
This should run the CockroachDB client and take you to a
SQL Command Prompt
as shown. If you don't see a command prompt, try pressing Enter.[email protected]:26257/defaultdb>
-
From the CockroachDB client, run the following commands:
- Create a user
maxroach
as follows:
[email protected]:26257/defaultdb> CREATE USER IF NOT EXISTS maxroach; CREATE USER 1 Time: 9.580878ms
- Create a database
employees
as follows:
[email protected]:26257/defaultdb> CREATE DATABASE employees; CREATE DATABASE Time: 14.449525ms
- Give the user,
maxroach
, permission to update our database,employees
, as follows:
[email protected]:26257/defaultdb> GRANT ALL ON DATABASE employees TO maxroach; GRANT Time: 9.308095ms
At this point, you should have a
user
and adatabase
. - Create a user
-
Type
\q
to quit the client console as shown:[email protected]:26257/defaultdb> \q pod "cockroach-client" deleted
The following steps show you how to view the results of the commands you ran in the earlier steps via the admin console
. You can access the console at localhost with port forwarding.
Note: You need to be logged in to your OpenShift Cluster with the OC login that you accessed in Step 2: Connect to the OpenShift cluster in your CLI.
-
Run the following command to port forward
8080
:$ kubectl port-forward example-cockroachdb-0 8080
Forwarding from 127.0.0.1:8080 -> 8080 Forwarding from [::1]:8080 -> 8080
-
Visit http://localhost:8080 on your browser as shown. The page should load the cluster overview.
-
Click on
databases
to view theemployees
database that you created earlier.
Once the CockroachDB is running and you created the user and database, you can now explore the JSON support for CockroachDB in a python runtime using Jupyter Notebook.
-
In your terminal, run the following command to port forward the
26257
port from the CockroachDB database instance. This port is used in your Jupyter Notebook to establish a connection with the CockroachDB database instance.$ kubectl port-forward example-cockroachdb-0 26257
Forwarding from 127.0.0.1:26257 -> 26257 Forwarding from [::1]:26257 -> 26257
-
Download and open the notebook Store-and-query-unstructured-Json-CockroachDB.ipynb in your local machine.
We are using the http://dummy.restapiexample.com/api/v1/employees API for demonstration purpose.
-
Click on the Cell tab and select Run All.
You can now follow the notebook instructions for more details on what is happening in each cell.
After you have executed the notebook, verify the table in the CockroachDB instance through the CockroachDB client from terminal.
-
In your terminal, run the following command to spin up a CockroachDB client:
$ kubectl run -it --rm cockroach-client \ --image=cockroachdb/cockroach \ --restart=Never \ --command -- \ ./cockroach sql --insecure --host=example-cockroachdb-public.cockroachdb-test
This should run the CockroachDB client and take you to a
SQL Command Prompt
as shown. If you don't see a command prompt, try pressing Enter.[email protected]:26257/defaultdb>
-
From the CockroachDB client, run the following commands to view the
user
,database
, andtable
which the Jupyter Notebook created:-
View
users
by running theSHOW users;
command:[email protected]:26257/defaultdb> SHOW users; user_name `-------------` cpuser maxroach root (3 rows) Time: 3.037641ms
-
View
databases
by running theSHOW databases;
command:[email protected]:26257/defaultdb> SHOW databases; database_name `-----------------` bank defaultdb employees postgres system (5 rows) Time: 2.890031ms
-
To view the tables present in the Employees database, run the
USE bank;
command to switch tobank
database, and run\d
command to view thetables
:[email protected]:26257/defaultdb> USE employees; SET Time: 11.83841ms [email protected]:26257/employees> \d table_name `----------------------` jsontbl (1 row) Time: 3.684617ms
-
-
Finally, to view the unstructured JSON from the table, run the
SELECT
command as follows:
<pre><code>[email protected]:26257/employees> <b>SELECT * from jsontbl;</b>
</code></pre>
![cmd-output](doc/source/images/cmdoutput.png)
In this tutorial, we showed you how to get unstructured JSON data from a third-party REST API, created a table in CockroachDB and store the unstructured JSON data into the table. You also learned how to query the unstructured data from CockroachDB table into a pandas dataframe which can be further used for data engineering and data science tasks.
Refer to the following documentation from Cockroach labs to learn more about the operator and its features.