Stackoverflow Python Tkinter Read From Sqlite3 Database

How to Download the Stack Overflow Database

I use a Microsoft SQL Server version of the public Stack Overflow data export for my blog posts and training classes because it's way more interesting than a lot of sample data sets out there. It's easy to learn, has just a few easy-to-understand tables, and has real-world data distributions for numbers, dates, and strings. Plus, it's open source and no charge for you – just choose your size:

  • Small: 10GB database as of 2010: 1GB direct download, or torrent or magnet. Expands to a ~10GB database called StackOverflow2010 with data from the years 2008 to 2010. If all you need is a quick, easy, friendly database for demos, and to follow along with code samples here on the blog, this is all you probably need.
  • Medium: 50GB database as of 2013: 10GB direct download, or torrent or magnet. Expands to a ~50GB database called StackOverflow2013 with data from 2008 to 2013. I use this in my Fundamentals classes because it's large enough that slow queries will actually be kinda slow.
  • Large: current 401GB database as of 2021/02: 54GB torrent (magnet.) Expands to a ~401GB SQL Server 2016 database. Because it's so large, I only distribute it with BitTorrent, not direct download links.
  • For my training classes: specialized copy as of 2018/06: 47GB torrent (magnet.) Expands to a ~180GB SQL Server 2016 database with queries and indexes specific to my training classes. Because it's so large, I only distribute it with BitTorrent, not direct download links.

After you download it, extract the .7Zip files with 7Zip. (I use that for max compression to keep the downloads a little smaller.) The extract will have the database MDF, NDFs (additional data files), LDF, and a Readme.txt file. Don't extract the files directly into your SQL Server's database directories – instead, extract them somewhere else first, and then move or copy them into the SQL Server's database directories. You're going to screw up the database over time, and you're going to want to start again – keep the original copy so you don't have to download it again.
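
The extract-elsewhere-then-copy step can be sketched in a few lines of Python. This is a minimal sketch, not part of the original post: the function name and the directory paths in the usage note are hypothetical, and it assumes the extracted files all sit in a single folder.

```python
import shutil
from pathlib import Path

# SQL Server data (.mdf/.ndf) and log (.ldf) file extensions.
DB_EXTENSIONS = {".mdf", ".ndf", ".ldf"}


def copy_database_files(extract_dir, data_dir):
    """Copy database files from extract_dir into data_dir.

    The originals are left untouched, so there is always a clean copy
    to start over from. Returns the names of the files that were copied.
    """
    source = Path(extract_dir)
    target = Path(data_dir)
    target.mkdir(parents=True, exist_ok=True)
    copied = []
    for path in sorted(source.iterdir()):
        # Skip Readme.txt and anything else that is not a database file.
        if path.is_file() and path.suffix.lower() in DB_EXTENSIONS:
            shutil.copy2(path, target / path.name)
            copied.append(path.name)
    return copied
```

For example, `copy_database_files(r"D:\Downloads\StackOverflow2010", r"D:\SQLData")` (both paths are placeholders) would copy the MDF, NDF, and LDF files and leave the extracted originals where they are.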

Then, attach the database. It's in Microsoft SQL Server 2008 format (2005 for the older torrents), so you can attach it to any 2008 or newer instance. It doesn't use any Enterprise Edition features like partitioning or compression, so you can attach it to Developer, Standard, or Enterprise Edition. If your SSMS crashes or throws permissions errors, you likely tried extracting the archive directly into the database directory, and you've got permissions problems on the data/log files.
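
If you would rather attach with a script than click through SSMS, the T-SQL form is CREATE DATABASE ... FOR ATTACH. Here is a rough Python sketch that only builds that statement as a string. The helper name and the file paths are hypothetical, so check the Readme.txt and your own folders for the real file names before running the result.

```python
def build_attach_statement(database_name, file_paths):
    """Return a T-SQL CREATE DATABASE ... FOR ATTACH statement.

    file_paths should list every data and log file (MDF, NDFs, LDF)
    at the location where SQL Server will read them.
    """
    if not file_paths:
        raise ValueError("at least one file path is required")
    # Escape the characters that would break out of [name] and N'...' quoting.
    safe_name = database_name.replace("]", "]]")
    file_specs = ",\n    ".join(
        "(FILENAME = N'{}')".format(path.replace("'", "''"))
        for path in file_paths
    )
    return "CREATE DATABASE [{}]\nON\n    {}\nFOR ATTACH;".format(
        safe_name, file_specs
    )


# Hypothetical paths, for illustration only.
statement = build_attach_statement(
    "StackOverflow2010",
    [
        r"D:\SQLData\StackOverflow2010.mdf",
        r"D:\SQLData\StackOverflow2010_log.ldf",
    ],
)
print(statement)
```

The printed statement can be pasted into a query window. The sketch does not connect to SQL Server or check that the files exist.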

As with the original data dump, this is provided under the cc-by-sa 4.0 license. That means you are free to share this database and adapt it for any purpose, even commercially, but you must attribute it to the original authors (not me):

  • Original Stack Exchange data dump
  • Stack Exchange, an awesome Q&A network

What's Inside the StackOverflow Database

I want you to get started quickly while still keeping the database size small, so:

  • All tables have a clustered index on Id, an identity field
  • No other indexes are included (nonclustered or full text)
  • The log file is small, and you should grow it out if you plan to build indexes or modify data
  • It only includes StackOverflow.com data, not data for other Stack sites
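
Growing the log file ahead of time is an ALTER DATABASE ... MODIFY FILE statement in T-SQL. As a hedged sketch, the Python helper below just builds that statement. The helper name, the logical file name in the usage note, and the 10GB target are assumptions for illustration. Look up the real logical name in sys.database_files and pick a size that suits your workload.

```python
def build_grow_log_statement(database_name, logical_log_name, size_gb):
    """Return a T-SQL statement that sets the log file to size_gb gigabytes."""
    # T-SQL expects a whole number before the GB suffix.
    if not isinstance(size_gb, int) or size_gb <= 0:
        raise ValueError("size_gb must be a positive whole number")
    safe_db = database_name.replace("]", "]]")
    safe_file = logical_log_name.replace("'", "''")
    return "ALTER DATABASE [{}] MODIFY FILE (NAME = N'{}', SIZE = {}GB);".format(
        safe_db, safe_file, size_gb
    )
```

For example, `build_grow_log_statement("StackOverflow2010", "StackOverflow2010_log", 10)` returns a statement you can run before an index build. The new size has to be larger than the file's current size, or SQL Server will reject the statement.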

To get started, here's a few helpful links:

  • This Meta.SE post explains the database schema.
  • If you want to learn how to tune queries, Data.StackExchange.com is a fun source for queries written by other people.
  • For questions about the data, check the data-dump tag on Meta.StackExchange.com.

Past Versions

I keep past versions online too in case you need to see a specific version for a demo.

  • 2020-06 – 46GB torrent (magnet.) Expands to a ~381GB SQL Server 2008 database. Yes, smaller torrent and larger database because I went wild and crazy with the compression. Took freakin' 36 hours to compress.
  • 2019-12 – 52GB torrent (magnet.) Expands to a ~361GB SQL Server 2008 database.
  • 2019-09 – 43GB torrent (magnet.) Expands to a ~352GB SQL Server 2008 database. This is the last export licensed with the cc-by-sa 3.0 license.
  • 2019-06 – 40GB torrent (magnet.) Expands to a ~350GB SQL Server 2008 database.
  • 2018-12 – 41GB torrent (magnet.) Expands to a ~323GB SQL Server 2008 database.
  • 2018-09 – 39GB torrent (magnet.) Expands to a ~312GB SQL Server 2008 database.
  • 2018-06 – 38GB torrent (magnet.) Expands to a ~304GB SQL Server 2008 database. Starting with this version & newer, the giant PostHistory table is included. As you can probably guess by the name, this would make for excellent partitioning and archival demos. As you might not guess, the NVARCHAR(MAX) datatypes of the Comment and Text fields make those demos rather…challenging.
  • 2017-12 – 19GB torrent (magnet.) Expands to a ~137GB SQL Server 2008 database.
  • 2017-08 – 16GB torrent (magnet), 122GB SQL Server 2008 database. Starting with this version & newer, each table's Id fields are identity fields. This way we can run real-life-style insert workloads during my Mastering Query Tuning class. (Prior to this version, the Id fields were just INTs, so you needed to select the max value or some other trick to generate your own Ids.)
  • 2017-06 – 16GB torrent (magnet), 118GB SQL Server 2008 database. Starting with this torrent & newer, I broke this up into multiple SQL Server data files, each in their own 7z file, to make compression / decompression / distribution a little easier. You need all of those files to attach the database.
  • 2017-01 – 14GB torrent (magnet), 110GB SQL Server 2008 database
  • 2016-03 – 12GB torrent (magnet), 95GB SQL Server 2005 database
  • 2015-08 – 9GB torrent (magnet), 70GB SQL Server 2005 database

Why are Some Sizes/Versions Only On BitTorrent?

BitTorrent is a peer-to-peer file distribution system. When you download a torrent, you also become a host for that torrent, sharing your own bandwidth to help distribute the file. It's a free way to get a large file shared amongst friends.

The download is relatively large, so it would be expensive for me to host on a server. For example, if I hosted it in Amazon S3, I'd have to pay around $5 USD every time somebody downloaded the file. I like you people, but not quite enough to go around handing you dollar bills. (As it is, I'm paying for multiple seedboxes to keep these available, heh.)
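
To see roughly where a per-download figure like that comes from, here is a back-of-the-envelope sketch. The $0.09 per GB rate is an assumption for illustration, not a number from the post, and real data-transfer pricing varies by region, tier, and date. The 54GB size is the large torrent listed above.

```python
# Assumed data-transfer-out price in USD per GB (illustrative only).
ASSUMED_RATE_USD_PER_GB = 0.09


def estimate_download_cost(size_gb, rate_usd_per_gb=ASSUMED_RATE_USD_PER_GB):
    """Return the estimated bandwidth cost in USD of serving one download."""
    return size_gb * rate_usd_per_gb


# One download of the 54GB archive at the assumed rate.
cost = estimate_download_cost(54)
print("Estimated cost per download: ${:.2f}".format(cost))
```

At that assumed rate the estimate lands a little under five dollars per download, in line with the ballpark in the paragraph above.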

Some corporate firewalls understandably block BitTorrent because it can use a lot of bandwidth, and it can also be used to share pirated movies/music/software/whatever. If you have difficulty running BitTorrent from work, you'll need to download it from home instead.

Source: https://www.brentozar.com/archive/2015/10/how-to-download-the-stack-overflow-database-via-bittorrent/