All language subtitles for 1. Importing Data into Excel

af Afrikaans
sq Albanian
am Amharic
ar Arabic
hy Armenian
az Azerbaijani
eu Basque
be Belarusian
bn Bengali
bs Bosnian
bg Bulgarian
ca Catalan
ceb Cebuano
ny Chichewa
zh-CN Chinese (Simplified)
zh-TW Chinese (Traditional)
co Corsican
hr Croatian
cs Czech
da Danish
nl Dutch
en English
eo Esperanto
et Estonian
tl Filipino
fi Finnish
fr French Download
fy Frisian
gl Galician
ka Georgian
de German
el Greek
gu Gujarati
ht Haitian Creole
ha Hausa
haw Hawaiian
iw Hebrew
hi Hindi
hmn Hmong
hu Hungarian
is Icelandic
ig Igbo
id Indonesian
ga Irish
it Italian
ja Japanese
jw Javanese
kn Kannada
kk Kazakh
km Khmer
ko Korean
ku Kurdish (Kurmanji)
ky Kyrgyz
lo Lao
la Latin
lv Latvian
lt Lithuanian
lb Luxembourgish
mk Macedonian
mg Malagasy
ms Malay
ml Malayalam
mt Maltese
mi Maori
mr Marathi
mn Mongolian
my Myanmar (Burmese)
ne Nepali
no Norwegian
ps Pashto
fa Persian
pl Polish
pt Portuguese
pa Punjabi
ro Romanian
ru Russian
sm Samoan
gd Scots Gaelic
sr Serbian
st Sesotho
sn Shona
sd Sindhi
si Sinhala
sk Slovak
sl Slovenian
so Somali
es Spanish
su Sundanese
sw Swahili
sv Swedish
tg Tajik
ta Tamil
te Telugu
th Thai
tr Turkish
uk Ukrainian
ur Urdu
uz Uzbek
vi Vietnamese
cy Welsh
xh Xhosa
yi Yiddish
yo Yoruba
zu Zulu
or Odia (Oriya)
rw Kinyarwanda
tk Turkmen
tt Tatar
ug Uyghur
Would you like to inspect the original subtitles? These are the user uploaded subtitles that are being translated: 1 00:00:05,560 --> 00:00:11,650 In this section of the course, we're going to explore some of the techniques we can use to prepare 2 00:00:11,650 --> 00:00:13,510 our data for analysis. 3 00:00:14,170 --> 00:00:20,590 Now, Excel is well known for its data analysis tools, so things like pivot tables and charts and different 4 00:00:20,590 --> 00:00:22,570 types of calculations and formulas. 5 00:00:23,020 --> 00:00:29,080 But what's lesser known is how important it is to make sure that the data that you're going to analyze 6 00:00:29,320 --> 00:00:36,850 is clean and in an appropriate state, because many times data that we're using isn't necessarily a 7 00:00:36,850 --> 00:00:38,380 data set that we've created. 8 00:00:39,040 --> 00:00:45,730 So it might be that you've received a data set from a client or a colleague, or maybe you've even downloaded 9 00:00:45,730 --> 00:00:49,510 data from an external system or some third party software. 10 00:00:50,170 --> 00:00:52,630 And as we all know, we don't live in a perfect world. 11 00:00:52,630 --> 00:00:59,560 So it might be that when you've done that, download the data that comes into Excel isn't in the best 12 00:00:59,560 --> 00:01:01,030 format for analyzing. 13 00:01:01,660 --> 00:01:05,110 It might be that you have blank cells, all blank rows everywhere. 14 00:01:05,410 --> 00:01:08,860 Maybe there's weird formatting or inconsistent case. 15 00:01:09,520 --> 00:01:13,060 Maybe they're spelling errors or duplicate values. 16 00:01:13,270 --> 00:01:18,850 All of these things need to be dealt with before you start to create things like pivot tables and pivot 17 00:01:18,850 --> 00:01:19,390 charts. 18 00:01:19,690 --> 00:01:22,090 And that is what we're going to look at in this section. 19 00:01:22,480 --> 00:01:27,250 I'm going to show you lots of different techniques when it comes to cleaning data. 20 00:01:27,490 --> 00:01:30,700 So in this first lesson, we need to get some data. 21 00:01:30,700 --> 00:01:31,420 First of all. 22 00:01:31,960 --> 00:01:36,190 And this is going to give me an opportunity to show you a couple of other techniques when it comes to 23 00:01:36,190 --> 00:01:37,690 importing data. 24 00:01:37,930 --> 00:01:43,690 So what we're going to do here is we're going to import a data set first from a folder that contains 25 00:01:43,720 --> 00:01:48,250 numerous excel files and then we're going to import a text file. 26 00:01:48,550 --> 00:01:51,520 Now for this, we need to start with a blank document. 27 00:01:51,790 --> 00:01:56,470 So I'm going to use the shortcut key of control and to create a new blank workbook. 28 00:01:56,800 --> 00:02:01,450 Now, to import data into Excel, we need to go to the data tab. 29 00:02:01,540 --> 00:02:05,050 And it's this first group here that gets and transform data. 30 00:02:05,590 --> 00:02:09,640 And this little group tends to change in every new release of Excel. 31 00:02:09,790 --> 00:02:14,680 They usually change the wording and where they have things positioned in this little area, but this 32 00:02:14,680 --> 00:02:17,800 is how it looks right now in Excel 2021. 33 00:02:17,950 --> 00:02:23,680 Now notice in here we have a few different buttons which will allow us to quickly import data from a 34 00:02:23,680 --> 00:02:29,650 text or CSV file, from the web, from another table or range of cells. 35 00:02:29,950 --> 00:02:33,310 We can go to recent sources or look at existing connections. 36 00:02:34,150 --> 00:02:41,740 We also have a get data dropdown over here, which gives us access to all these different sources to 37 00:02:41,740 --> 00:02:42,880 import data from. 38 00:02:43,420 --> 00:02:49,990 And you can see there are so many so you can do things like import data from a SharePoint list, or 39 00:02:49,990 --> 00:02:57,310 maybe you want to import something from Microsoft Exchange or even an external application like Salesforce. 40 00:02:57,550 --> 00:03:04,300 Maybe you want to import something from an access database or an Oracle database, or maybe you just 41 00:03:04,300 --> 00:03:11,140 want to import from a different file that might be an Excel file, a text file, maybe a PDF, or even 42 00:03:11,140 --> 00:03:12,040 from a folder. 43 00:03:12,700 --> 00:03:19,090 So lots of different sources that we can import data into excel from now before we get to this point. 44 00:03:19,120 --> 00:03:24,700 Let me show you the files that we're going to import, and you'll find these to download in the Course 45 00:03:24,700 --> 00:03:25,450 Files folder. 46 00:03:25,870 --> 00:03:30,100 Now, in the course files, I have a folder here called sales data. 47 00:03:30,710 --> 00:03:36,670 If I double click to open this folder, I have in here for files that are all exactly the same. 48 00:03:36,970 --> 00:03:39,910 They just contain sales data for different years. 49 00:03:40,540 --> 00:03:47,530 But if I was to open each one, they have the same format, so they all have the same column headings. 50 00:03:47,740 --> 00:03:52,930 They just contain different sales data for each of the years now because these files are pretty much 51 00:03:52,930 --> 00:03:55,430 identical in their layout. 52 00:03:55,450 --> 00:03:59,800 It means that I can import them and combine them all in one go. 53 00:04:00,460 --> 00:04:08,980 So instead of having four separate files for the 2016, 2017 and 2019 data, what I can say to excel 54 00:04:08,980 --> 00:04:15,640 on the import is import all four of these from the folder and combine them together into one big, long 55 00:04:15,640 --> 00:04:16,120 file. 56 00:04:16,360 --> 00:04:18,970 So let's go back to excel. 57 00:04:19,750 --> 00:04:22,540 Let's go to get data from file. 58 00:04:22,780 --> 00:04:27,520 And because I have these stored in a folder, I can choose to import them from the folder. 59 00:04:27,760 --> 00:04:29,700 Let's select the folder. 60 00:04:29,710 --> 00:04:31,050 I don't need to drill down into it. 61 00:04:31,060 --> 00:04:35,050 I just need to select the folder that contains the files and click on Open. 62 00:04:36,660 --> 00:04:42,570 Now, what Excel will do here is it's going to take you into power query, and you may not be all that 63 00:04:42,570 --> 00:04:45,570 familiar with power query or you might have used it briefly. 64 00:04:45,900 --> 00:04:50,910 We're not going to go too far into power query because it is more of an advanced topic, but I am going 65 00:04:50,910 --> 00:04:53,370 to show you a couple of small things that you can do here. 66 00:04:53,610 --> 00:04:58,380 So what Excel has done is it's picked up those four files and you can see them sitting here. 67 00:04:58,380 --> 00:05:01,980 So sales 2016 to sales 2019. 68 00:05:02,790 --> 00:05:07,890 And then at the bottom, I have a few different options with regards to what I can do with these files. 69 00:05:08,610 --> 00:05:10,050 Now I can choose to combine them. 70 00:05:10,050 --> 00:05:16,590 At this stage, I can simply load them straight into excel, or I can transform the data. 71 00:05:17,430 --> 00:05:24,030 Now, transforming the data will open these files, empower query and will give you access to all of 72 00:05:24,030 --> 00:05:26,700 power queries tools for cleaning data. 73 00:05:26,970 --> 00:05:30,960 Now, the focus of this section of the course isn't to learn how to clean data. 74 00:05:30,960 --> 00:05:34,950 Simple query it's to learn how to clean data in Excel. 75 00:05:35,610 --> 00:05:39,300 However, let's click Transform data so you can see what this looks like. 76 00:05:39,630 --> 00:05:42,840 So basically, I'm now in the power query editor. 77 00:05:43,350 --> 00:05:46,860 And as I said, don't worry too much about what you see on here. 78 00:05:47,190 --> 00:05:52,860 The only thing I really want to show you here is how you can use power query to combine these four files 79 00:05:52,860 --> 00:05:53,340 together. 80 00:05:54,060 --> 00:05:56,940 So currently, there's still four separate files. 81 00:05:57,330 --> 00:06:02,940 What I can do here is click these two dropdown arrows and when I hover over it, says combined files, 82 00:06:03,270 --> 00:06:06,540 it's going to give me a preview as to what the first file looks like. 83 00:06:06,930 --> 00:06:14,220 And if I click on OK, it's going to combine those files together and load the data into the power query 84 00:06:14,220 --> 00:06:14,730 window. 85 00:06:15,510 --> 00:06:21,870 So now I basically have a big long list of all of those files instead of four separate files. 86 00:06:22,110 --> 00:06:26,100 Now that I've done that, I can load it into excel and start to clean it. 87 00:06:26,880 --> 00:06:30,180 So I have a close and load button at the top here. 88 00:06:32,060 --> 00:06:38,060 And what this does is it basically creates a connection between that folder and excel. 89 00:06:38,210 --> 00:06:44,870 So if maybe at the end of the year, I then get a file called Sales 2020 in order to update this Excel 90 00:06:44,870 --> 00:06:51,650 spreadsheet, all I would need to do would be to copy the sales 2020 file to that sales data folder 91 00:06:52,070 --> 00:06:53,990 because we're connected to the folder. 92 00:06:54,470 --> 00:07:00,080 The connection will update and automatically include all the 2020 data in this file. 93 00:07:00,260 --> 00:07:03,230 So that is why this method can be pretty useful. 94 00:07:03,380 --> 00:07:08,810 Now I'm going to close down the queries and connections because once it's been imported, notice that 95 00:07:08,810 --> 00:07:11,450 it's automatically been placed in a table. 96 00:07:12,110 --> 00:07:13,020 How do I know that? 97 00:07:13,040 --> 00:07:15,650 Will I have the table design ribbon at the top? 98 00:07:15,980 --> 00:07:21,860 And I have a table name called Sales Data, which now includes all four of those files. 99 00:07:22,130 --> 00:07:25,520 Now there might be a couple of things I want to do here to tidy up this data. 100 00:07:26,300 --> 00:07:32,060 For example, I don't really need Column A, which is just basically listing the file name. 101 00:07:32,630 --> 00:07:35,180 So I'm probably going to delete out that column. 102 00:07:35,900 --> 00:07:42,740 I might want to apply some formatting to these columns a control shift down arrow and let's go to home 103 00:07:42,740 --> 00:07:44,960 and apply some currency formatting. 104 00:07:45,200 --> 00:07:48,260 And there might be some other bits and bobs that I want to do here. 105 00:07:48,560 --> 00:07:55,160 So that is one way that I can import a folder of files into excel, and you don't necessarily have to 106 00:07:55,160 --> 00:07:56,390 always import a folder. 107 00:07:56,390 --> 00:08:02,330 If you just had one excel file, you could go from file from workbook and then you can just select the 108 00:08:02,330 --> 00:08:03,650 file that you want to import. 109 00:08:04,040 --> 00:08:08,330 Now what we're actually going to do and this next file is the file that we're actually going to be cleaning 110 00:08:08,330 --> 00:08:11,450 up is we want to import a text file. 111 00:08:12,080 --> 00:08:15,500 And again, you're going to find this file in the course files folder. 112 00:08:15,680 --> 00:08:23,630 So if we go up to the data tab again in this get and transform data group, I have a from text CSP button. 113 00:08:24,200 --> 00:08:29,660 So let's click and it's this file here that we want to import sales data dot text. 114 00:08:29,930 --> 00:08:33,980 Now this is going to open a preview of this text file. 115 00:08:34,250 --> 00:08:40,250 I have the option of opening this simple query and cleaning up in that application by clicking the Transform 116 00:08:40,250 --> 00:08:41,090 data button. 117 00:08:41,420 --> 00:08:45,890 But this time, I'm simply going to load it straight into excel and clean it there. 118 00:08:46,160 --> 00:08:48,230 So let's click on the load button. 119 00:08:49,510 --> 00:08:56,470 And there we go, so once again, it's loaded this data in and it's created a table automatically out 120 00:08:56,470 --> 00:08:57,220 of this data. 121 00:08:57,610 --> 00:09:03,490 And it's this data set that we're going to take the time to clean up because notice there are a few 122 00:09:03,490 --> 00:09:07,510 different issues going on in here now to make this a bit easier to see. 123 00:09:07,540 --> 00:09:13,000 I'm going to go to table design and just turn off banded rows so we can see these issues a little bit 124 00:09:13,000 --> 00:09:13,480 clearer. 125 00:09:14,350 --> 00:09:19,810 Now I can see that in this spreadsheet, I have a couple of blank rows in here which I need to get rid 126 00:09:19,810 --> 00:09:20,140 of. 127 00:09:20,710 --> 00:09:26,080 I can also see Column B. I have some inconsistent casing issues going on. 128 00:09:26,520 --> 00:09:30,700 I also have some weird spacing at the front of these words. 129 00:09:30,910 --> 00:09:36,400 I can see that many of these columns haven't got the correct formatting applied, and there might well 130 00:09:36,400 --> 00:09:39,130 be duplicates lurking around in here as well. 131 00:09:39,880 --> 00:09:44,950 So all of these are things that we need to consider when we're cleaning our data sets. 132 00:09:45,190 --> 00:09:49,870 So we're going to start out with this dataset in the next lesson, and I'm going to show you how you 133 00:09:49,870 --> 00:09:55,840 can quickly remove all blank rows and even blank cells from your worksheets. 14132

Can't find what you're looking for?
Get subtitles in any language from opensubtitles.com, and translate them here.