Background

It is widely known in the data industry that most of the work required to create a usable data asset is spent cleaning and organize the raw data. Data from different systems is messy and rarely follows a standard structure. 

In Numetric, we've created a set of tools that significantly reduce the time required for these types of cleaning tasks. The tools themselves (the actual operations that can be performed on the data) are nothing revolutionary. They include standard filtering, if-then logic, value mapping functions, and other typical transformation formulas. Instead, what makes Numetric’s cleaning experience so special is the placement (at the entry point into the system) and reusability (because of our rule-based approach) of our tools.

Rule-Based Cleaning

The Numetric approach to cleaning centers on rules that you can establish for each individual Table. Those rules are saved and reapplied every time new data from that source enters the Source Table. The result is a set-it-and-forget-it process that should only have to be done once and will ensure that your Tables always have the latest, cleaned and organized data.

Source Tables vs. Tables

It's important to distinguish between Source Tables and Tables within Numetric. The list of Source Tables simply allows you to manage the import and regular updating of external sources of data. Each Source Table that is checked (active) under the Source List will be ingested into Numetric and show up in the list of Tables. These Tables will be included in the data preparation workflow. To clean and otherwise transform your data, you can apply transformation rules to the imported Tables. (Note: any changes applied to Tables are non-permanent and will not actually change the Source Table itself.)

Think of Sources as a sort of staging area for data that you might want to bring into Numetric. Some data connectors will give you access to many tables from the external system, and often you'll only need a few tables from the many available. 

Cleaning Tables

The way Numetric cleans and modifies its data is a little different than most other platforms. Rather than renaming, adjusting, and modifying your data right before you build a data visualization, at Numetric we feel it provides a better experience for our users if we make those adjustments at the beginning of the journey rather than the end. By cleaning the data right after import, it makes the data easier to use and more accessible to everyone. For example, your default column headers may not make much sense to people not familiar with that specific database. By modifying the column names at the Table level, all subsequent users will benefit from that work.

There are three different categories of changes that you can make to your Tables: Transformations, Column Properties, and Table Properties.

Transformations

Transformations make it possible for you to adjust and filter the actual data within your Table. There are four different types of transformations that are available for you to use on your Tables.They are powerful enough to make even complex modifications, and intuitive enough that you’ll feel comfortable making transitions in no time. Those transformations are: formula, if then statement, filter, find and replace. 

        Formula

The formula transformation is a flexible transformation with which you can apply a wide variety of operations to your Tables. There are nearly endless options when it comes to formula transformations, so we won't enumerate all of the possible options here. You can consult our more detailed formula documentation to learn more about your options.

        If Then Statement

The If Then transformation allows conditional operations to be performed with the data. Within an If Then transformation, you can set up a variety of triggers (criteria), each with a potential true or false response. When the trigger (or group of triggers) is true for a given row in the data, Numetric will apply one or more actions (note that Numetric will not take any action if the statement is false)

Numetric also allows you to add multiple triggers and multiple outcome actions. When multiple triggers are provided, you'll need to choose whether to use And/Or logic on the set of triggers. As you might expect, selecting And will only apply the action if all of the trigger criteria are true, and selecting Or will apply the action if any of the trigger criteria are true.

There are two types of actions that can be selected: Set, and Run Formula. Set will allow you to set a custom, static value (either text or number). Selecting Run Formula will allow you to assign a value from another column, or apply another custom formula.

        Filter

The Filter transformation allows you to filter out rows that fit certain criteria, called Triggers. You can filter by removing rows that fit the criteria, or by only keeping rows that fit the criteria. 

        Find and Replace

The Find and Replace transformation allows you to easily search for certain values in specified columns, and easily replace them with corresponding values (in the same column or a different column) with corresponding values.  

Changing Column and Table Properties

Most of the work with Tables happens in the Transformations tab, but it's also important for you to know about the other two tabs: Column Properties and Table Properties. 

        Column Properties

The Column Properties section contains information (metadata) about each of the columns in the selected Table. Here you can reorder the columns in the Table, as well as customize the properties of each column in the Table.

To reorder the display order of the Table columns, you can either drag and drop them to the desired location or change the order number of the column to move it to that specific location in the Table.

To change the metadata for a column, select the column from the sidebar. This will display the column metadata in the center of the screen. Any changes to the Data Type may not be carried out immediately upon updating your Table, these changes may take a few minutes to show in your Table.

Here you can rename the column by changing the text in the Table Column Alias field. The Description field is a free-form text field that allows you to add a description of the data contained in this column. You can adjust the Data Type from the drop-down menu, this will allow you to determine which type of data is contained in the field. Determining the correct Data Type is a crucial step in preparing your data for use in Numetric. Each of the Data Types, along with a brief description of the Data Type are examined in the Data Types article.

The remaining headings all contain general information about the column. This includes the Source heading, which displays the name of the column in the Source Table, the Column Overview, which displays general information about the values in the column, and the Term heading, which will display any Dictionary Terms associated with this column.

When you have made the desired changes to your column properties, click the Publish button in the top right to save or you can click Revert to cancel any changes.

        Table Properties & Sharing

From here you can edit the Table name, and category, as well as see general information about the selected Table. This includes the size of the Table, the total number of rows contained in the Table, and the date that it was updated. The Associated Terms and Related Tables headings display any terms or Tables that have been joined to this Table through the Numetric Dictionary.

One of the main functions of the Table Properties tab is the ability to share the Table with other Numetric users. The sidebar on the left displays all users with access to view this Table. To add a member, click on the New Member drop down, and type or select from the drop down the name of the user you would like to add, then, set their permissions in the field to the right. Options include: Can View, and Can Edit. Users with view permissions can use the Table in a Dataset, but cannot make changes to the actual Table. Users with edit permission can make changes and revisions to the transformations or properties of the Table.

Now What?

Cleaning and formatting data in Tables is the key to having an easy-to-use experience with your data. Now, with your rules set up, each time your Source Table refreshes, pulling new data into Numetric, it will be all ready to go, with no effort on your part.

The next step, the Numetric Dictionary, allows you to build relationships between your various Tables making it easy to look at data from various sources.

Did this answer your question?