Transforming CSV in WSO2 MI— The Easy Way

Dilan Tharaka
8 min readJan 3, 2021

--

CSV is a widely used content format in the digital world. In WSO2 MI you can use a number of mediators to transform a CSV payload. DataMapper, PayloadFactory, and Script Mediator are some of them. But now you have an even simpler and powerful way to transform your CSV payloads.

Previously in the tool pallet of the Integration Studio, you only had some pre-defined mediators. But with this new feature called modules, you can add more tools (Modules) into the tool pallet. Adding a module to Integration Studio is as easy as adding a connector.

For an easy transformation of CSV, WSO2 introduces a new module called CSV Module. This article is all about how to use the CSV Module.

The CSV module supports the following transformation,

  • CSV to CSV
  • CSV to JSON
  • CSV to XML
  • JSON to CSV
  • XML to CSV

In this article, I’m showing you how to perform a CSV to JSON transformation using the CSV module. To do that, I’m using a simple REST API. The easiest way to create an API in WSO2 MI is to use the Integration Studio getting started page.

Generating the REST API

First, You have to update the Integration Studio to the latest version. If you want to know how to get the latest updates, check out the docs.

The Integration Studio latest version comes up with lots of really cool features 😍, The all-new getting started page is one of them.

Integration Studio Getting Started Page

Now you can easily create the project you need or start with a sample using this getting started page.

For this example, I’m creating a Rest API using the Hello World Service example in the getting started page.

To create the API project, click on the Hello World Service sample on the getting started page,

Give a name for your project and click Finish.

The Hello World Service example is a sample REST API project. You can see the Synapse configs of HelloWorld API in the design view bellow,

Now, we’ll see how to import the CSV Module into our workspace.

Importing the CSV Module

The first step of importing a module is to create a connector exporter project. Right-click on the root project (csvTransformer) and in the menu click on, new -> Connector Exporter

In the New Connector Exporter Project wizard, give a name for the connector exporter project and click Finish. Now, in Project Explorer, you can see the newly added connector exporter project. Then, right-click on the configs project (csvTransformerConfigs) and click on the Add or Remove Connectors/Modules from the menu.

In the Add or Remove Connectors/Modules wizard, click on the Add connector/module option (this is the default option) and click Next. Then, in the search box, type CSV, and click on the search button. Now, you can see the CSV Module in the search results. Click on the download button to add the Module to the project. After the module is downloaded. Close the wizard by clicking the Finish button.

Then, right-click on the connector exporter project and click on the menu, New -> Add/Remove Connectors.

In the Add or Remove Connectors/Modules wizard, click on the Add connector/module option (this is the default option) and click Next.

Then click on the Workspace button to import the module from the workspace. In there, select the CSV Module and click OK. Then click on Finish to finish the wizard.

Now, in the Design view, we’ll remove the Payload Factory mediator (since we don’t need that for this scenario) by clicking on it and pressing the Delete key in the keyboard.

Now in the Tool Palette you can see the CSV Module group.

This group has all the operations supported by the CSV Module. Now, drag and drop the csvToJson operator into the design view before the Response mediator. After doing that, your design view should look like following,

Great, we have successfully added the CSV Module to the project and added the csvToJson operator into the design view. Our next step is to config the operator. Before that, I’ll explain the transformation that I’m trying to do here.

CSV to JSON Transformation

I have the following CSV input,

id,name,email,phone_number
1,De witt Hambidge,dwitt0@newsvine.com,true
2,Brody Dowthwaite,bdowthwaite1@delicious.com,false
3,Catlin Drought,cdrought2@etsy.com,608-510-7991
4,Kissiah Douglass,kdouglass3@squarespace.com,true
5,Robinette Udey,rudey4@nytimes.com,true

And I need to transform this into the following JSON,

{
"results": [
{
"index": 2.0,
"name": "Brody Dowthwaite",
"email": "bdowthwaite1@delicious.com"
},
{
"index": 3.0,
"name": "Catlin Drought",
"email": "cdrought2@etsy.com"
},
{
"index": 4.0,
"name": "Kissiah Douglass",
"email": "kdouglass3@squarespace.com"
},
{
"index": 5.0,
"name": "Robinette Udey",
"email": "rudey4@nytimes.com"
}
]
}

Let’s see how to do this with the CSV to JSON operator in the CSV Module.

Configuring the operator

Click on the CSV to JSON operator that we added to the design view in the previous step. Now you can do the configurations in the Properties view.

The first thing we have to configure is the Header property under the CSV Input group. We need to set this to Present using the list box. By setting it to present, we are telling the CSV Module that our CSV has a header. Then it will map that header with the CSV input payload and we can use the header names to specify columns in other configurations.

Under the CSV Transformation group, you can now see the property Skip Headers after selecting the value Present to the Header property. Select value True for that property since we don’t need the header row of the input CSV in the output JSON.

In the output JSON, We don’t need the column phone_number. To skip that, we can use the property Skip Columns. In that property, we can specify the columns as a list ( 1,2), as a range (1:3), or as a combination of both ( 1,2,5:10). And since we have set the Header property to Present we can give the column names rather than column index. Therefore, to skip the column phone_number we can set the value of the Skip Columns as “phone_number”. Note that we have to use double quotes if we are giving the names of the columns.

If you observe the output JSON, not only the header row but the first data row in the CSV is omitted. This can be easily done with the CSV Module using the property Skip Data Rows. You can give the number of data rows to skip in this property. In this case, it’s 1.

Another change that I need in the output is that, Instead of using the CSV headers, I need different keys for the JSON. Since we have said that we have a header in the CSV, the default behavior of the CSV Module is to use those headers as the keys of the output JSON. But if we need our own keys, then we can specify them as a comma-separated list in the JSON Keys property. In this case, we are setting it as index,name,email.

In CSV, we don’t have a way to define data types. All the values in a CSV payload are string values. But in a JSON we have the ability to use different data types such as string, numbers, and boolean. The CSV Module supports this data type conversion. We can use the Data Types property to set the desired data types. If we don’t specify the data types, the CSV Module considers them as String data.

As you can see in the above image, the Data types property is a table (You need to have the updated Integrations Studio to have this feature). You can add a new entry to the table by clicking the + button in the upper right corner. For our transformation, we need the id column in the CSV to be a number in the output JSON. We can specify that as given below,

Finally, have you notices that the output JSON is not just a transformation of the CSV input. We have to wrap the results in a JSON object and specify the values under a key of results. This is a common scenario in most of the transformations. We can do this easily using the CSV Module by specifying the key of the wrapper object in the Root JSON Key property.

When we have set all the properties, the final properties view is like bellow,

You can specify these properties in the source view also. The source view with the above configs are like bellow,

<CSV.csvToJson>
<headerPresent>Present</headerPresent>
<skipHeader>true</skipHeader>
<columnsToSkip>"phone_number"</columnsToSkip>
<dataRowsToSkip>1</dataRowsToSkip>
<csvEmptyValues>Null</csvEmptyValues>
<jsonKeys>index,name,email</jsonKeys>
<dataTypes>[{"Column Name Or Index":"id","Is Column Name":"Yes","Data Type":"Number"}]</dataTypes>
<rootJsonKey>results</rootJsonKey>
</CSV.csvToJson>

Now we are all set 😎. We can test this by running the project and sending the CSV payload into the API like bellow,

That’s how to transform a CSV to JSON in WSO2 MI ( Or EI ) using the CSV Module. The CSV Module can be used to do more transformations as I said before. As you can see, we didn’t use any code. That’s the main advantage of using the CSV Module. You have to write zero code lines to perform a complex transformation. So, consider using the CSV Module for your next payload transformation. Don’t think twice to comment if you have any hesitation 📝

--

--

Dilan Tharaka

Software engineer curious about how things actually work