DAX for Power BI Part 2.5 - Text in Calculated Columns

Welcome to this Wise Owl DAX for Power BI tutorial. In this part of the series we're going to look at how to work with text in your calculated columns. We'll start with some simple examples of concatenating strings using a variety of techniques, and we'll look at how you can format different values in the strings you've concatenated as well. We'll also show you how to extract text from a string, whether that's from the left, the right or the middle, and look at how to find or search for characters to locate the position you want to extract the text from.

We'll explain how you can trim text using the TRIM function to get rid of any leading or trailing spaces, and then, towards the end of the video, explain how to replace or substitute text within a larger string. So let's get started.

To get started, I've created a new blank report and, as usual, the first thing I'll do in here is import some data from an Excel workbook. So I'll pick one of the options which allows me to do that, and then, when I've browsed to the location that I've saved my Movies workbook in, I'll double-click on that to launch the import wizard. As usual, I'll drop a link in the video description so you can download this same file yourself. Once the dialog is launched, I'm going to check the box next to the Movies worksheet, and I can click the Load button to import all this data into the report's data model. When that's happened, I can have a quick check in the Fields list over here on the right-hand side to make sure that I've imported all the columns I'm expecting.

For the first basic calculated column we'll create in this video, we're going to concatenate the values of two columns in this table to form a larger piece of information. But just before we do that, I want to take a quick look at why it might be necessary. So let's start by creating a visual in the report. I'm going to add a basic clustered column chart, and then I'm going to resize this so it fills the full size of the page. Let's just get rid of the Filters panel temporarily so we can see things a little more clearly.

Then, in the Axis bucket of the field well, I'm going to drop the Title field from the Movies table, and in the Values bucket I'm going to drop the run time, which is going to generate a sum of run time. Ideally, what I'm trying to do here is show one single column for each film in my data set, but the problem is that I have several films which have duplicated titles. So some of these titles, like the first one, Anna Karenina, are showing a total run time of 472 minutes.

That's a pretty long film; even I wouldn't sit there watching that. What I really want to do is separate this column out into however many versions of Anna Karenina there are in the database. If you wanted to check that you do have duplicated values, one quick, simple way to do that in the chart is to change the sum of run time into a count by clicking on the drop-down list next to it. In there we'll see that we've got four different versions of Anna Karenina, three different versions of Godzilla, and so on.


And two of many other films as well. Let's just switch back to a sum of run time, then we can head back to the Data view by clicking the Data button on the left-hand side and create our calculated column to make our film titles unique.

A simple way to make our film titles unique is to concatenate them with the unique ID column that we've imported from the movies database. I know these IDs are unique because the data was originally exported from a SQL Server database where the ID column was a primary key, and that guarantees that the values in there must be unique. So all we're going to do is create a new column in the usual way. I'm going to click my New Column button and, if I just zoom in so you can see what I'm doing, I'll call this one Title Unique. I'll make it equal to, and then we're going to use a function that's designed for concatenation: it's called CONCATENATE. So I can insert the CONCATENATE function. You'll notice you can only concatenate two bits of text with this function, but that's okay because we've only got two things we want in this first basic example. So I'll refer to the Movies Title column first and then the Movies ID column second. I can close the round brackets and press Enter, and now I've got a unique title with the number of the film's ID attached to the end of the title. I appreciate it doesn't look great right now, but we'll improve on that in a moment.

The important thing is that if we go back to the Report view and swap out the original Title column for the Title Unique column, we do now get a single column for each individual film. So the longest film is no longer quite so long: instead of 472 minutes, it's only 260. Still quite long in my book.

Just before we go back and start improving the appearance of the calculated column we've created, it's worthwhile mentioning how you could generate your own ID or index column if you don't already have one. A convenient way to do that in Power BI Desktop is to use the Power Query Editor. As it turns out, a lot of the things we're going to be doing in this video using DAX would be far more convenient and easy to do in the Power Query Editor. But just for the moment, I'm going to get you to find the Queries section on the Home tab of the ribbon and then click the top half of that button to open the query editor. Every single thing that you import into a Power BI report generates a query; you just don't normally see them unless you open up the Power Query Editor.

What we're going to do here is head over to the Add Column tab and then choose Index Column. I'm just going to click the drop-down arrow at the right-hand side and choose to build my new index column from a value of one, so I'm going to choose the From 1 option, although it doesn't really matter too much. That's sufficient to create a brand new column starting at the number one for the first film and then simply incrementing by one for each additional film in the table, so every value in that column is completely unique.

To save those changes and get back to Power BI Desktop, I can head back to the Home tab here in the Power Query Editor and then click the Close & Apply button to update those changes and add that new index column into my data model. You can see that it has appeared there. So if I headed back to the Data view, we could easily have just concatenated the value of the index column rather than the value of the ID column, and that would have guaranteed that our film titles were unique as well.
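As a rough sketch of the calculated column described so far, the formula might look something like the following. The table name Movies and the column names Title, ID and Index are assumptions based on how they are spoken in the video; the names in the downloadable workbook may differ.

```dax
-- Calculated column on the Movies table (table and column names are assumed).
-- CONCATENATE accepts exactly two arguments.
Title Unique = CONCATENATE ( Movies[Title], Movies[ID] )

-- Alternative, assuming an Index column was added in the Power Query Editor.
Title Unique Index = CONCATENATE ( Movies[Title], Movies[Index] )
```

Either version simply appends the number to the end of the title with no separator.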

Now let's work on improving the appearance of the calculated column we created, because at the moment just tagging the ID number onto the end of the film name looks a bit messy, and it's also potentially quite confusing. Look, we've got Lethal Weapon 423. I think Danny Glover would definitely be too old for this stuff at that point in the franchise. So what we'll do is insert a space after the film title but before the ID.

The problem at the moment is that with the CONCATENATE function we have no convenient way to do that. I pointed out earlier that CONCATENATE only has two parameters, for the two individual bits of text you want to put together. So, just to show you that it is possible with CONCATENATE, I'm going to replace Movies ID with a literal space in some double quotes. Then what I would like to do is concatenate that concatenated text to the Movies ID. So I could wrap another CONCATENATE function around this existing CONCATENATE function, click at the end of the line, type in a comma and then refer to the Movies ID. I didn't say this was a good solution, but it does prove that it can work. If I close the extra set of round brackets for the extra CONCATENATE, that inserts a space and makes things look a little neater.

A slightly more elegant choice would be to use a function called COMBINEVALUES. This isn't, strictly speaking, what the function is designed for, but we can take advantage of it just to concatenate several bits of text with a space in between each piece of information. So let's get rid of everything on that line apart from "co", and I'm going to pick COMBINEVALUES from the IntelliSense and hit the Tab key to see the list of parameters. With this one we begin with the delimiter: what character do you want to see in between each bit of text you put together? I'd like to put a literal space in some double quotes. I can then type in a comma and refer to the first expression, the first thing I want to concatenate, so that's going to be the Movies Title, and then another comma, and then I could refer to the ID column, Movies ID. If I wanted to, I could carry on at that point by adding more commas and referring to more columns or writing more bits of text, but for this first basic example I want to stop at Movies ID, close the round brackets and then hit Enter. The end result will be the same as the nested CONCATENATE function, but in a slightly neater way.

Now that we can list multiple values to concatenate much more easily, we could try making the ID stand out a little bit, or separate it from the film title, by wrapping it in some round brackets. To do that in the COMBINEVALUES function, after the Movies Title and a comma, I'm going to insert some literal text in double quotes, an open round bracket symbol, followed by a comma and then Movies ID, and then after Movies ID another comma and a close round bracket symbol in double quotes. When I hit Enter now, that will indeed place a set of round brackets around the ID number but, slightly annoyingly, because of the way COMBINEVALUES works, it also puts a space in between each value. I don't really want to have that space there at all.

What would be really convenient for COMBINEVALUES would be to say: separate these things with an empty string, so don't put any characters between them, just let me choose which bits of information I put together in the way I want. I'd want a space after the movie title but before the open round bracket, so I could type in a literal space there in those double quotes. Then I don't want a space before the Movies ID, or after the Movies ID before the close round bracket. That would be great, but unfortunately when I press Enter it won't work, and the reason it doesn't work is that COMBINEVALUES can't accept an empty string as its first parameter, its delimiter.

So this leads us to probably the simplest solution for concatenating things: don't bother trying to use functions to do it, just use the concatenate operator instead. I'm going to wipe out this entire expression and start with a reference to the Movies Title, and then use a single ampersand as the concatenate operator.
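To compare the approaches side by side, here is a hedged sketch of three ways to get the title, a space and the ID into one column. The names Movies, Title and ID are again assumed from the narration, and the three definitions are alternatives, so only one would be used for the Title Unique column at a time.

```dax
-- Option 1: nested CONCATENATE (two arguments per call, so two calls are needed).
Title Unique = CONCATENATE ( CONCATENATE ( Movies[Title], " " ), Movies[ID] )

-- Option 2: COMBINEVALUES, delimiter first, then the expressions to join.
Title Unique = COMBINEVALUES ( " ", Movies[Title], Movies[ID] )

-- Option 3: the & operator, which places no limit on the number of pieces.
Title Unique = Movies[Title] & " " & Movies[ID]
```

The operator version is the easiest to extend, because each separator can be different.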

Then I can choose whatever I want to concatenate, so I'll put a space and an open round bracket in some double quotes, followed by another ampersand and then the ID, and then another ampersand and, in some double quotes, the close round bracket. Finally, when I press Enter, I'll have the result I want: the film title with its unique ID in some round brackets at the end.

The final thing I'd like to do with this column is make the piece of information inside the round brackets a little more meaningful to somebody reading the report. Although the ID is unique and gives me the results I want in the chart, it doesn't mean much to a reader. So let's replace the ID with some of the information from the release date column. I think it's fairly unlikely we'll release more than one film with the same title on the same date. A nice simple thing we could do there is get rid of the ID field and refer to the release date field instead. I'm going to ignore this extra menu that's appeared for the moment and explain what that's about shortly. I'll press the Escape key to clear that extra menu and then press Enter to update my formula and see the complete date entered inside those round brackets.

Now that's a little more useful, but perhaps a little too much information. I think we could probably get away with just displaying the year that the film was released in; again, it's pretty unlikely we'll release more than one film with the same title in the same year. To return the year from the release date we have a couple of choices. I'm going to bring back that menu that I cleared earlier by typing in a full stop after the release date, and that provides me with a list of properties of the date. So if I wanted just the year, I could insert the year from the list and then hit Enter, and that will show me just the year of the film's release.

If you didn't see this extra menu appear, it may mean that you have a slightly different setting applied in Power BI Desktop's settings to do with the automatic date/time intelligence. This is something we're going to cover in a lot of detail later on in this series, so I don't want to talk about it too much right now. If you don't see or have access to this extra properties list, you can get the year of a release date by using a function instead. So I'm going to get rid of the dot-year property and then, in front of the Movies release date column reference, I'm going to insert the YEAR function. I'll refer to YEAR, open some round brackets and then close the round brackets after the release date column. Press Enter to update the formula, and once again we've got the year of the film's release listed after its title. That is a particularly meaningful piece of information; you can see it in several places, both on the chart labels and in the chart tooltips when you hover the mouse cursor over one of the film names.

For the next example, I'd like to take a look at what happens when we concatenate a formatted value. So let's head back to the Data view first, and then I'm going to pick on the Budget column, head to the currency format tool and choose to format that as English (United States). That means we'll get the dollar sign at the beginning, units of thousands separated by commas and a couple of decimal places too. Now I'd like to create a calculated column that says something like the film's title, followed by the word "cost", followed by the value of the Budget column. To do that, let's create a new column, and we'll call this one something like Film Cost, equals, and then I'm going to refer to the Title column first and concatenate that with the word "cost" in some double quotes (you'll notice I've deliberately included spaces either side of the word "cost" there), and then another ampersand and a reference to the Budget column. When I press Enter, we will indeed get the value of the Budget column, but it doesn't include any formatting.

If you do want to apply a format to a value when you're concatenating it with other values, you can use the cleverly named FORMAT function. Just in front of the reference to the Movies Budget column, I can write the word FORMAT, open some round brackets and then fill in two parameters for the FORMAT function. The first is the value that I want to apply the formatting to, so that's the Movies Budget, followed by a comma and then the format I would like to apply. There are several things I can do with the format parameter. There are some named formats I can apply; for instance, there's a named format called Currency, so I'll use that, close the double quotes and then close the round brackets. Now bear in mind I want to see US dollars as the currency symbol, and you may be able to tell from my accent that I'm not actually based in the United States. So when I press Enter, unfortunately for this example, I get the pound sterling symbol instead. That's not particularly helpful, so in this case I'm going to get rid of that named Currency format and write my own custom format code instead.

These are the same custom format codes you can write in the format text box up here. I'm going to start with a dollar symbol and then just a simple digit of zero, and if I press Enter now you'll see that I get the dollar symbol at the front of the number, although I now lack some of the other fancier effects: I don't get the comma separators or the decimal places. Adding the decimal places is relatively easy: if I type in a full stop and then as many decimal places as I like (I'll just go for two), that will bring back the two decimal places. Adding the comma separators is a little more fiddly. The pattern I need to write just after the dollar symbol is a hash, comma, zero. That represents the pattern of digits that might appear, and that pattern is repeated for however large the number is. If I press Enter now, I'll get the commas repeating every set of three digits.

There are a couple of other slightly more clever things I can do here as well. If I type in a single comma just before the decimal point and then press Enter, what you'll see is that one set of three digits gets stripped off, so I've scaled the number to thousands. If I type in another comma before the decimal point and press Enter again, that strips off another set of three digits and effectively scales the number to millions. At this point I could actually add a letter, so let's add the letter M after the two decimal places and hit Enter again, and now I've got my numbers reading a little more succinctly: 175.00 M.
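Pulling together the formulas narrated since the ampersand operator was introduced, a sketch might look like this. The column names Release Date and Budget are assumptions based on the narration, the format strings are written out from the spoken description, and where a column name repeats the definitions are alternatives rather than columns to create together.

```dax
-- Title with the release year in round brackets, using the YEAR function.
Title Unique = Movies[Title] & " (" & YEAR ( Movies[Release Date] ) & ")"

-- Same idea using the date property list (only available when the
-- automatic date/time setting is switched on).
Title Unique = Movies[Title] & " (" & Movies[Release Date].[Year] & ")"

-- Budget concatenated without any formatting.
Film Cost = Movies[Title] & " cost " & Movies[Budget]

-- Named format: the currency symbol follows the regional settings.
Film Cost = Movies[Title] & " cost " & FORMAT ( Movies[Budget], "Currency" )

-- Custom format: dollar sign, thousands separators, two decimal places.
Film Cost = Movies[Title] & " cost " & FORMAT ( Movies[Budget], "$#,0.00" )

-- Each comma before the decimal point scales by a thousand; two commas
-- scale to millions, with the letter M added as a suffix.
Film Cost = Movies[Title] & " cost " & FORMAT ( Movies[Budget], "$#,0,,.00M" )
```

Because FORMAT returns text, the resulting column can be used as a label but not summed like the original Budget column.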

There will be some budgets that do actually have a decimal value in here somewhere. There we go, 72 and a half million there. I think for this example I might even be tempted to get rid of the decimal places altogether, so let's just go for the double comma and then the M. That's a nice sort of label that I can read and maybe assign to some visuals in the report. I'm just scratching the surface of what you can do with custom number formats. If you are interested in some more detail, there is an entire page dedicated to the FORMAT function on the Microsoft Docs site, so I'll drop this link in the video description so you can get to it fairly easily. You'll see quite a few other cool things you can do with that, but I think for our example that's probably good enough for now.

So now that we've spent some time putting text together, let's look at how we can do the opposite and split some large bits of text into separate parts. This is the point at which, if you wanted to do this for real in a proper report, you'd definitely want to consider doing it in the Power Query Editor.

A lot of the things we're about to do using DAX can be done really simply with a couple of clicks in Power Query. We do have a few videos which talk you through how to do some of those things: in the Power BI tutorial series, part 1.2, Creating and Publishing Your First Report, and then in part three, in the various sections in there, we go into some detail on how the query editor works. Of course, this series is all about DAX, so we're going to have to do things the difficult way, I'm afraid, but we're going to start with a fairly simple example.

The first thing I'd like to do is just extract the first character from the film name. I'll show you why we're going to do that in just a moment, but let's create the calculated column first. I'm going to click on the New Column button and call this new column Film Initial. I'm going to make it equal to the result of a function called LEFT, which you may well have encountered in other Microsoft products. I'm going to refer to the film Title column. The second parameter says how many characters from the left of the specified string you want to return. You can omit that (you can tell because it's listed in square brackets there), and if you don't specify a number of characters you get just one. That's the number that I want, so I could omit the number of characters parameter, but I'm actually going to type in the number one; it's nice to be specific about things like this. If I then close the round brackets and press Enter, we'll extract the first character from the left of the film's title into a new column.

This provides us with a nice way to create some kind of slicer or filter in the report. If we switch back to page one, where we created our chart earlier on, I'm going to give myself a bit of space up at the top here, and then in that blank section I'm going to insert a slicer from the Visualizations panel. Let's just reduce the size of that as well to fit in the space I've given it. Then I'm going to put the Film Initial field into the slicer and change the layout of that slicer as well. I'm going to go to Format, the paint roller tool just below the Visualizations panel, and in the General section change the orientation from vertical to horizontal. We can see there that we get a set of buttons; we've effectively created a sort of directory-type slicer, so I could click on a single letter and see all the films whose title begins with that letter in the chart.

One additional thing it would be nice to do with this slicer is group all the numbers together into a single item that says, say, 0 to 9. We have a fairly convenient way to do that. If you look at the way these items are organized in the slicer, they're sorted alphabetically, or in ascending order, so you can see that all the numbers appear before any of the letters. So we could check if the first character we've extracted from the film's title is less than the letter A, and if it is, we can group it together as a digit. To do that, let's head back to the Data view, and then at the beginning of this expression we can write an IF function. So let's say IF, open some round brackets, and then just for a neater layout I'm going to press Shift and Enter to take that LEFT function down to the next line. At the end of that I can check if that's less than the first letter, the letter A of course, followed by a comma. If that's true, on the next line I want to return a single label, 0 to 9. I can then type in a comma after that and then Shift and Enter. Otherwise I just want to produce that same formula, LEFT Movies Title comma one, so let's just copy and paste that. I'll close the extra set of round brackets for the IF function and press Enter, and then, if we have a look back at the Report view, we should see that our slicer now has a single option, 0 to 9. If I click on that, it shows me any films whose first character is a numeric digit from zero to nine.
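The finished Film Initial column might look roughly like the sketch below. The names Movies and Title are assumed from the narration, and the exact label text ("0-9" here) is a guess at what was typed on screen.

```dax
-- First character of the title, with anything sorting before "A"
-- grouped under one label (label text is an assumption).
Film Initial =
IF (
    LEFT ( Movies[Title], 1 ) < "A",
    "0-9",
    LEFT ( Movies[Title], 1 )
)
```

Note that the comparison groups every character that sorts before the letter A, so a title starting with punctuation would probably land under the same label as the digits.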

So when the number of characters you want to return is the same on each row of the table, things are fairly straightforward, but it gets a bit trickier when the number of characters you want to return varies row by row. To demonstrate that, let's head back to the Data view, and I'd like to create a column which extracts the director's first name into a separate column. The problem with this, of course, is that each director has a different number of characters in their first name. To solve this problem, we're going to try to calculate the position of the character which separates the first name from the other names, so we're going to try to locate the position of the space character in the director's full name.

We've got a couple of functions we can use to do this. Let's start by creating the new column, and I'm going to call this one First Name, or Director First Name, equals, and then I could use either the SEARCH function or the FIND function. The single major difference between these two functions is that FIND is case-sensitive and SEARCH isn't.
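A small illustration of that case-sensitivity difference, using literal strings rather than the movies data. The column names Find Demo and Search Demo are hypothetical, and the third and fourth arguments (start position and not-found value) are filled in so that neither formula raises an error.

```dax
-- FIND is case-sensitive: there is no lowercase "b" in "ABC",
-- so the not-found value 0 is returned.
Find Demo = FIND ( "b", "ABC", 1, 0 )

-- SEARCH is case-insensitive: "b" matches "B" at position 2.
Search Demo = SEARCH ( "b", "ABC", 1, 0 )
```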

As we're trying to look for a space character, it doesn't really matter which of the two we choose to use here, so I'm going to go with the FIND function. The first two parameters are compulsory: what I'm looking for, so a literal space character, and then what I'm looking within, the Director column. The third parameter, the start position, indicates which character you begin your search from. If you miss that out, it starts searching from character number one, which is what I want. As for the not-found value parameter, well, let's see why that's particularly useful. If I close the round brackets here and press Enter, I end up with an error in the first name column. The reason that's happened is that there's at least one row in which no space character has been found, so you see the little error message here: the search text provided could not be found in the given text.

So if I go back to the parameters and type in a comma after I've referred to the Movies Director column, I want to skip over the start position by typing in two commas to skip to the not-found value. What I could do here is type in the number zero to indicate that the character hasn't been found. If I press Enter now, it will return a number to that column indicating the position of the space character. So now that we know how many characters we want to return from the left, we can pass the results of this function into...
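As a sketch of the column at the point where this transcript breaks off, assuming the table is called Movies and the column Director:

```dax
-- Position of the first space in the director's name.
-- The start position is skipped (two commas), so the search begins at
-- character 1; 0 is returned where no space exists instead of an error.
Director First Name = FIND ( " ", Movies[Director], , 0 )
```

At this stage the column holds a number rather than a name; the step that turns it into the first name falls outside this excerpt.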

The content of this post was transcribed from the channel: https://www.youtube.com/watch?v=GrU3sMLRQQE