Flow Basics: Data Extraction

A core feature of the NLP engine is the ability to extract data from user input. This can be done in two ways: contextually, using entities, and linearly, using slots.

Slot filling

The easiest way to capture user input is to use the Any text trigger. By default, this captures any user input and stores it in a param.

The following example demonstrates a linear flow where we capture a user's name.

extract-any-text.png

Typed data

Besides capturing any text, you can capture data of a specific type, for example email addresses, dates, phone numbers, or a custom list of data.

Matching entity types does not require exact input from a user. For example, the sentences "my email is foo@bar.com" and "it's foo@bar.com" would both match and extract foo@bar.com.

extract-email1.png

As the above example illustrates, you can also combine capturing user input with other trigger types by simply branching them.
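To illustrate why exact input is not required, here is a minimal sketch that pulls an email address out of a longer sentence. The pattern and the function name extractEmail are assumptions made for this illustration. They are not the platform's actual matcher, which is not documented here.

```javascript
// Illustrative only: a deliberately simple pattern, not the platform's matcher.
const EMAIL_PATTERN = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/;

// Returns the first email-like substring in the text, or null if none is found.
function extractEmail(text) {
    const found = text.match(EMAIL_PATTERN);
    return found ? found[0] : null;
}

console.log(extractEmail("my email is foo@bar.com")); // foo@bar.com
console.log(extractEmail("it's foo@bar.com"));        // foo@bar.com
console.log(extractEmail("no address here"));         // null
```

Both example sentences yield the same extracted value, even though the surrounding words differ.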

Benefits

The main benefits of using the Any text trigger for slot filling are:

  • It does not require any training data
  • It is more accurate at extracting specific data

Contextual

For use cases without a linear flow, you can extract data contextually from user input by annotating entities.

For example, when a user sends "I want to fly from Amsterdam to San Francisco", we want to detect that Amsterdam is the place of departure and San Francisco is the place of arrival.

Entities

Within the training view of any intent, you can mark entities by annotating them. To train the entity classifier, annotate lots of examples. When you annotate lots of cities, for example, the classifier will learn to recognize cities.

extract-marked-entities2.png

This means it will also pick up on cities you didn't explicitly mark. Of course, you are not limited to cities: you could create a food entity, an animal entity, a movie entity, or something else entirely.

Tip: Mark and name entities consistently

Make sure you always mark every example. Do not skip any, as this harms the classification process.

Benefits

Although it requires more work, contextual entity matching has its own benefits for advanced use cases:

  • Extract data non-linearly from any input
  • Extract multiple values contextually, such as departure and destination

Validation and formatting

Various entity types support validation and data transformation. For example, if an entity has the Date type, the system converts a value containing "tomorrow" to a UTC datetime format.
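The kind of transformation described above can be sketched as follows. This is only an illustration under stated assumptions: the helper resolveDateEntity and its small word table are hypothetical, the sketch only handles an input that is exactly one of those words, and the exact output format the platform produces is not specified in this article. The reference time is passed in so the result is reproducible.

```javascript
// Hypothetical word table: relative date words and their offset in days.
const DAY_OFFSETS = new Map([
    ["yesterday", -1],
    ["today", 0],
    ["tomorrow", 1],
]);

// Resolves a relative date word to a UTC ISO 8601 string (midnight UTC),
// or returns null when the word is not in the table.
function resolveDateEntity(text, now) {
    const word = text.trim().toLowerCase();
    if (!DAY_OFFSETS.has(word)) {
        return null;
    }
    const dayMs = 24 * 60 * 60 * 1000;
    // Start of the reference day in UTC, shifted by the offset in days.
    const startOfDay = Date.UTC(now.getUTCFullYear(), now.getUTCMonth(), now.getUTCDate());
    return new Date(startOfDay + DAY_OFFSETS.get(word) * dayMs).toISOString();
}

const reference = new Date("2021-06-16T10:14:00Z");
console.log(resolveDateEntity("Tomorrow", reference)); // 2021-06-17T00:00:00.000Z
console.log(resolveDateEntity("someday", reference));  // null
```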

System entity types

We support the following system entity types:

Name      Example
Text      Anything from names to cities. This is the most used entity type: the AI will match anything you train it on using your examples. To match music artists, for instance, train the AI with examples like Madonna, Michael Jackson, and The Weeknd.
Date      tomorrow, next week, 1st of July
Time      12am, 23:59, now, tomorrow at 8am
Number    2453, twenty one
Email     john@doe.com
URL       examplecorp.com
Distance  200 meters, 12 miles, 1.5cm
Money     42€, $99

Custom entity type

At the moment, we support only one custom type: the List entity type.

List entity types work in a similar way to the Text entity type, with one important difference: lists set boundaries on what the AI will match.

Let's take the above example of a departure and destination city. Perhaps you only want to offer a select number of cities a person can travel to. You can limit these options by creating a custom list entity type named city. Within this entity type, you provide only the city names that are valid to travel to.

examples-entities-city3.png

When the AI detects an entity of this custom entity type, it determines whether the entity matches the list. The AI may find an entity that is not valid for the entity type. For example, Washington could be detected in "I want to travel to Washington", but since Washington is not in our list, it is immediately dropped.

Tip: Matching is fuzzy

There is no need to add typos as synonyms: matching is case-insensitive and works with grammar errors.

Exact matching

By default, list entity types do an exact match. This means a list entity type will only match an extracted value if it is actually present in the list. When you turn off exact matching, custom entities work the same way as a regular Text entity type.
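The difference between exact matching on and off can be sketched like this. The list contents, the function matchListEntity, and the shape of each list entry are assumptions made for illustration. The sketch only does a case-insensitive lookup and does not attempt the fuzzy matching mentioned in the tip above.

```javascript
// Hypothetical list entity "city": each entry has an id, a value, and synonyms.
const cityList = [
    { id: "AMS", value: "Amsterdam", synonyms: [] },
    { id: "NY", value: "New York", synonyms: ["NYC"] },
];

// Looks up a detected text in the list, ignoring case.
// With exact matching on, values that are not in the list are dropped (null).
// With exact matching off, they pass through as a Text entity would.
function matchListEntity(detected, list, exact = true) {
    const needle = detected.trim().toLowerCase();
    for (const entry of list) {
        const names = [entry.value, ...entry.synonyms].map((n) => n.toLowerCase());
        if (names.includes(needle)) {
            return { match: detected, value: entry.value, id: entry.id };
        }
    }
    return exact ? null : { match: detected, value: detected };
}

console.log(matchListEntity("NYc", cityList));               // the New York entry
console.log(matchListEntity("Washington", cityList));        // null: not in the list
console.log(matchListEntity("Washington", cityList, false)); // passed through as plain text
```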

Using extracted data

Any extracted data is present within params. Those params can be used within Code, webhooks and string templates.

The following example shows a try-out window that displays extracted data in the right sidebar.

extract-tryout-params4.png

Params always come in the form of an array, even when only one match is found.

The following shows an example of an extracted entity within a param named destination.

// Example payload inside cloud code with one destination match
const payload = {
    params: {
        destination: [{
            match: "NYc",
            value: "New York",
            id: "NY"
        }]
    }
};

Multiple destinations would look like this:

// Example payload inside cloud code with two destination matches
const payload = {
    params: {
        destination: [{
            match: "NYc",
            value: "New York",
            id: "NY"
        }, {
            match: "Amsterdam",
            value: "Amsterdam",
            id: "AMS"
        }]
    }
};
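Because params are always arrays, code that reads them can treat the single-match and multi-match cases the same way. The following is a minimal sketch: the helper getParamValues and the variable examplePayload are hypothetical names, and only the payload shape is taken from the examples above.

```javascript
// Returns the `value` of every match for a param, or an empty array when the
// param is missing. Params are arrays even when only one match is found.
function getParamValues(payload, name) {
    const matches = (payload.params && payload.params[name]) || [];
    return matches.map((m) => m.value);
}

const examplePayload = {
    params: {
        destination: [
            { match: "NYc", value: "New York", id: "NY" },
            { match: "Amsterdam", value: "Amsterdam", id: "AMS" },
        ],
    },
};

console.log(getParamValues(examplePayload, "destination")); // [ 'New York', 'Amsterdam' ]
console.log(getParamValues(examplePayload, "departure"));   // []
```

Returning an empty array for a missing param keeps calling code free of null checks.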

Filling user profile attributes

User profile attributes can be set using params by giving them a system-defined name.

The following profile data can be filled:

  • user.name
  • user.profile.fullName
  • user.profile.firstName
  • user.profile.lastName
  • user.profile.gender (M/F/U)
  • user.profile.locale
  • user.profile.timezone (offset from UTC, e.g. -1)
  • user.profile.email
  • user.profile.picture (URL)
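As a rough sketch of how the names above relate to params, the following helper picks out the params whose names appear in the list. The helper pickProfileValues and the sample params are hypothetical. How the platform actually applies these values to the profile is not shown in this article, so this only illustrates which params qualify.

```javascript
// Param names listed above as fillable profile attributes.
const PROFILE_ATTRIBUTES = new Set([
    "user.name",
    "user.profile.fullName",
    "user.profile.firstName",
    "user.profile.lastName",
    "user.profile.gender",
    "user.profile.locale",
    "user.profile.timezone",
    "user.profile.email",
    "user.profile.picture",
]);

// Collects the first matched value of every param that carries a
// system-defined profile name. Other params are ignored.
function pickProfileValues(params) {
    const picked = {};
    for (const [name, matches] of Object.entries(params)) {
        if (PROFILE_ATTRIBUTES.has(name) && matches.length > 0) {
            picked[name] = matches[0].value;
        }
    }
    return picked;
}

const sampleParams = {
    "user.profile.firstName": [{ match: "john", value: "John" }],
    destination: [{ match: "NYc", value: "New York", id: "NY" }],
};

console.log(pickProfileValues(sampleParams)); // only the firstName entry is picked
```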

Read more

We've written a number of articles about use cases for extracted data:

Last update: 06-16-2021 10:14 AM