exasol, python

How to: Create an EXABucketFS service and bucket

We recently upgraded Exasol from 5.0.17 to 6.0.3 and noticed that our existing Python modules were gone and our automated jobs were failing. After reading through the updated documentation, we figured out that Exasol has introduced a cool new feature called EXABucketFS.

After reading through various community questions and watching a few YouTube videos, I got the hang of uploading modules and linking them up with Exasol.

Here’s a quick guide on how to create an EXABucketFS service and bucket.

First, log in to the admin page of your Exasol instance and click on EXABuckets. Click the Add button, give the service a meaningful description and assign it a port. In our example we've created bucketfs1. The first service, bfsdefault, is created automatically.

Below you can see the size of the bucket, the port number and the data disk the bucket is stored on.


Now click on the service you've created and create your bucket inside it. The mandatory fields are a meaningful name, a read password for accessing the bucket and a write password for writing/uploading files to the bucket.


The UDF path is important as this is the path we will be using to access the bucket.
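As a rough illustration, the UDF path seems to follow the pattern /buckets/{service}/{bucket}/..., judging by the example path used later in this post. The helper below is my own hypothetical one (udf_path is not an Exasol function); it only joins those parts together so you can see how the path is put together.

```python
import posixpath


def udf_path(service, bucket, *parts):
    """Join service, bucket and any sub-folders into a UDF path.

    Assumes the /buckets/<service>/<bucket>/ layout seen in this post.
    """
    return posixpath.join("/buckets", service, bucket, *parts)


# Service and bucket names taken from the example in this post
print(udf_path("bucketfs1", "bucket1", "urllib3-1.22", "urllib3-1.22"))
```

If your service or bucket has a different name, swap it in and compare the result with the UDF path shown on the admin page.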

Once the bucket has been created, download curl. When I first started with curl I had an issue where it closed as soon as I launched it; I then realised that you need to open a command prompt and change to the directory where curl is installed before you can use it. We're using curl to communicate with the newly created bucket.

After you’ve got curl up and running try the following command to insert a test file into the newly created bucket.

curl -X PUT -T filename http://w:{write password}@{ip address and node number:port}/{bucketname}/filename
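If you'd rather script the upload than type it, here is a minimal Python 3 sketch of the same PUT request using only the standard library. The host, port, bucket name and password are made-up placeholders, and build_put_request is my own helper name. The w:{write password}@ part of the curl URL is HTTP basic auth with the user name w, so the sketch sends it as an Authorization header instead.

```python
import base64
import urllib.request


def build_put_request(host, port, bucket, filename, data, write_password):
    """Build (but do not send) an HTTP PUT request for a bucket upload."""
    url = "http://{}:{}/{}/{}".format(host, port, bucket, filename)
    # Same credentials curl sends for w:{write password}@...
    credentials = "w:{}".format(write_password).encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    request = urllib.request.Request(url, data=data, method="PUT")
    request.add_header("Authorization", "Basic " + token)
    return request


# Placeholder values: replace with your own node address, port and password
req = build_put_request("192.0.2.10", 2580, "bucket1", "test.txt", b"hello", "secret")
print(req.get_method(), req.full_url)

# To actually upload, uncomment the next line (needs access to the node):
# urllib.request.urlopen(req)
```

Building the request separately from sending it makes it easy to check the URL before anything touches the bucket.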

To check that the file has been uploaded, use the following command:

curl {database server:port/bucketname}
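The same check can be sketched in Python 3. This assumes the bucket answers with one file name per line and that it can be read without a password; both are assumptions on my part, so adjust if your bucket behaves differently. The function names are mine.

```python
import urllib.request


def parse_listing(body):
    """Split the text returned by the bucket URL into a list of file names."""
    return [line.strip() for line in body.splitlines() if line.strip()]


def list_bucket(host, port, bucket):
    """Fetch the bucket listing over HTTP (needs network access to the node)."""
    url = "http://{}:{}/{}".format(host, port, bucket)
    with urllib.request.urlopen(url) as response:
        return parse_listing(response.read().decode("utf-8"))


# Offline example with a made-up response body
print(parse_listing("test.txt\nurllib3-1.22.tar.gz\n"))
```

Call list_bucket with your own node address, port and bucket name once you're happy with the parsing.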

The output should list everything you've uploaded to the bucket. To delete a file, use the following command:

curl -X DELETE http://w:{write password}@{ip address and node number:port}/{bucketname}/filename

This is a replicated service, so it doesn't matter which node you write to; the file should be accessible from the other nodes too.

Exasol say that they have an LS function available to view everything in the bucket, but I was not able to find it, so we wrote our own function, which takes the UDF path as a parameter and outputs everything in the bucket. It can be used from EXAplus.

CREATE PYTHON SCALAR SCRIPT "LS" ("my_path" VARCHAR(100) UTF8) EMITS ("FILES" VARCHAR(100) UTF8) AS
import os
def run(c):
    # Emit one row per entry found under the given UDF path
    names = os.listdir(c.my_path)
    for n in names:
        c.emit(n)
/
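If you want to try the body of the script before creating it in the database, something like the sketch below works on any machine. FakeContext is a hypothetical stand-in I made up for the context object Exasol passes to run(); it only mimics the my_path attribute and the emit() call used above. I've also sorted the names so the output is predictable, which the script itself doesn't do.

```python
import os
import tempfile


class FakeContext(object):
    """Hypothetical stand-in for the context object passed to run()."""

    def __init__(self, my_path):
        self.my_path = my_path
        self.emitted = []

    def emit(self, value):
        # Collect rows instead of sending them back to the database
        self.emitted.append(value)


def run(c):
    # Same idea as the LS script: emit one row per directory entry
    for name in sorted(os.listdir(c.my_path)):
        c.emit(name)


# Try it against a temporary folder instead of a real bucket path
folder = tempfile.mkdtemp()
for name in ("b.txt", "a.txt"):
    open(os.path.join(folder, name), "w").close()

ctx = FakeContext(folder)
run(ctx)
print(ctx.emitted)
```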

To create a connection to the EXABucket, use the following command:

CREATE CONNECTION connectionname TO ' bucket location '
IDENTIFIED BY ' read password '

To use the newly uploaded Python modules with Exasol, import them as follows.

First, tell Python where the uploaded module is located:

import sys
sys.path.insert(1, '/buckets/bucketfs1/bucket1/urllib3-1.22/urllib3-1.22/')

Then import the Python module:

import urllib3
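A slightly more defensive version of the same idea is sketched below. add_module_path is my own hypothetical helper, not part of Exasol; it checks that the folder exists and avoids adding the same path twice, which can help when a typo in the service or bucket name would otherwise only show up as an ImportError.

```python
import os
import sys


def add_module_path(path):
    """Put path on sys.path once; return False if the folder is missing."""
    if not os.path.isdir(path):
        return False
    if path not in sys.path:
        sys.path.insert(1, path)
    return True


# Path from the example above; it only exists inside the database
if add_module_path('/buckets/bucketfs1/bucket1/urllib3-1.22/urllib3-1.22/'):
    import urllib3
else:
    print("bucket path not found; check the service and bucket names")
```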

Let me know how you get on using this new feature,

VGJ

 
