Lecture 7

With thanks to CS50's alumni and friends.

Most of us are familiar with spreadsheets: rows of data, where each column in a row holds a different piece of data related to the others in that row. A database is an application that can store data, and we can think of Google Sheets as one such application.

For example, we created a Google Form to ask students for their favorite TV show and its genre. We look through the responses and see that the spreadsheet has three columns: "Timestamp", "title", and "genres". We can download a CSV file from the spreadsheet with "File > Download", upload it to our IDE, and see that it's a text file with comma-separated values matching the spreadsheet's data.

It turns out that, when working with data, we only need four operations: create, read, update, and delete. In SQL, the commands to perform each of these operations are INSERT, SELECT, UPDATE, and DELETE.

First, we'll need to create a table with the CREATE TABLE table (column type, ...) command.

SQL, too, has its own data types to optimize the amount of space used for storing data, such as:

- BLOB, for "binary large object", raw binary data that might represent files
- numeric(precision,scale), which avoids floating-point imprecision by using as many bits as needed for each digit before and after the decimal point
- char(n), for an exact number of characters
- varchar(n), for a variable number of characters, up to a certain limit

SQLite is one database application that supports SQL, and many other applications support SQL as well, including Oracle Database, MySQL, PostgreSQL, MariaDB, and Microsoft Access.

After inserting values, we can use functions to perform calculations, too, such as DISTINCT, for getting distinct values without duplicates.

There are also other operations we can combine as needed:

- WHERE, matching on some strict condition
- JOIN, combining data from multiple tables

We can update data with UPDATE table SET column = value WHERE condition, which may affect zero, one, or more rows depending on our condition. For example, we might say UPDATE favorites SET title = "The Office" WHERE title LIKE "%office", and that will set the title of every row whose title ends in "office" to "The Office", so we can make them consistent.

And we can remove matching rows with DELETE FROM table WHERE condition, as in DELETE FROM favorites WHERE title = "Friends". We can even delete an entire table altogether with another command, DROP.

IMDb, or "Internet Movie Database", has datasets available to download as TSV, or tab-separated values, files. For example, we can download a file that contains basic data about titles:

- tconst, a unique identifier for each title, like tt4786824
- titleType, the type of the title, like tvSeries
- primaryTitle, the main title used, like The Crown
- startYear, the year a title was released, like 2016
- genres, a comma-separated list of genres, like Drama,History

We take a look at the file after we've unzipped it, and we see that the first row holds the headers we expected and that each row has values separated by tabs. But the file has more than 6 million rows, so even searching for one value takes a moment.

We'll download the file into our IDE with wget, and then use gunzip to unzip it. But our IDE doesn't have enough space, so we'll use our Mac's terminal instead.

We'll write import.py to read the file in:

    import csv

    # Open TSV file for reading
    with open("", "r") as titles:

        # Since the file is a TSV file, we can use the CSV reader and change
        # the delimiter to a tab
        reader = csv.DictReader(titles, delimiter="\t")

        # Open new CSV file for writing
        with open("shows0.csv", "w", newline="") as shows:

            # Create writer
            writer = csv.writer(shows)

            # Write header of the columns we want
            writer.writerow(["tconst", "primaryTitle", "startYear", "genres"])

            # Iterate over TSV file
            for row in reader:

                # If non-adult TV show (assuming IMDb's isAdult column, "0" for non-adult)
                if row["titleType"] == "tvSeries" and row["isAdult"] == "0":

                    # Write row
                    writer.writerow([row["tconst"], row["primaryTitle"], row["startYear"], row["genres"]])

Now, we can open shows0.csv and see a smaller set of data. But it turns out that, for some of the rows, startYear has a value of \N, a special value that IMDb uses to represent missing values. So we can filter out those values and convert startYear to an integer to filter for shows after 1970.

    import cs50
    import csv

    # Create database by opening and closing an empty file first
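The filtering step these notes describe (skipping rows whose startYear is \N, then keeping shows from 1970 onward) might look roughly like the sketch below. It is not the lecture's own code: the function name filter_shows, the in-memory sample rows, and the isAdult column name are assumptions made for illustration, and "after 1970" is read here as 1970 or later.

```python
import csv
import io


def filter_shows(tsv_file, csv_file, min_year=1970):
    """Copy non-adult TV series with a known startYear >= min_year.

    Returns the number of rows written (not counting the header).
    """
    reader = csv.DictReader(tsv_file, delimiter="\t")
    writer = csv.writer(csv_file)
    writer.writerow(["tconst", "primaryTitle", "startYear", "genres"])
    kept = 0
    for row in reader:
        # IMDb writes \N for a missing value, so skip those rows
        # before trying to convert the year to an integer
        if row["startYear"] == "\\N":
            continue
        year = int(row["startYear"])
        # isAdult is assumed to be "0" for non-adult titles
        if row["titleType"] == "tvSeries" and row["isAdult"] == "0" and year >= min_year:
            writer.writerow([row["tconst"], row["primaryTitle"], year, row["genres"]])
            kept += 1
    return kept


# Hypothetical sample rows standing in for the real IMDb file
SAMPLE = (
    "tconst\ttitleType\tprimaryTitle\tisAdult\tstartYear\tgenres\n"
    "tt4786824\ttvSeries\tThe Crown\t0\t2016\tDrama,History\n"
    "tt0000001\ttvSeries\tOld Show\t0\t1955\tComedy\n"
    "tt0000002\ttvSeries\tUnknown Year\t0\t\\N\tDrama\n"
    "tt0000003\tmovie\tSome Film\t0\t1999\tAction\n"
)

output = io.StringIO()
count = filter_shows(io.StringIO(SAMPLE), output)
print(count)
print(output.getvalue())
```

Checking for \N before calling int() matters, because int("\\N") would raise a ValueError on the first row with a missing year.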
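The SQL commands covered in these notes (CREATE TABLE, INSERT, SELECT, UPDATE, DELETE, and DROP) can be tried from Python with the standard library's sqlite3 module. The following is a minimal sketch under a few assumptions: an in-memory database, a favorites table with title and genres columns, and made-up sample rows. The function name demo is hypothetical, and the notes themselves do not run SQL this way.

```python
import sqlite3


def demo():
    """Run the basic SQL operations once and return the distinct titles left."""
    db = sqlite3.connect(":memory:")

    # CREATE TABLE table (column type, ...)
    db.execute("CREATE TABLE favorites (title TEXT, genres TEXT)")

    # INSERT a few hypothetical responses
    db.executemany(
        "INSERT INTO favorites (title, genres) VALUES (?, ?)",
        [
            ("the office", "Comedy"),
            ("Office", "Comedy"),
            ("Friends", "Comedy"),
            ("The Crown", "Drama"),
        ],
    )

    # UPDATE: '%office' matches titles that end in "office"
    # (SQLite's LIKE ignores case for ASCII letters by default)
    db.execute("UPDATE favorites SET title = 'The Office' WHERE title LIKE '%office'")

    # DELETE rows that match a condition
    db.execute("DELETE FROM favorites WHERE title = 'Friends'")

    # SELECT with DISTINCT to drop duplicate titles
    rows = db.execute("SELECT DISTINCT title FROM favorites ORDER BY title")
    titles = [row[0] for row in rows]

    # DROP removes the whole table
    db.execute("DROP TABLE favorites")
    db.close()
    return titles


print(demo())
```

Single quotes are used for the string values inside the SQL statements because SQLite may interpret double-quoted text as a column name.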