Sports data

Sports data are typically published online and in newspapers as box scores. Box scores give a numerical view of a sporting event and are of interest for sports betting and fantasy sports. While box scores contain a wealth of information, they are impractical for performing research.

The Sports Data Query Language (SDQL), which is based on the open-source Python programming language and is in use at the free web sites http://KillerSports.com and http://SportsDataBase.com, allows users to ask questions such as "How do the Cubs do after their starter got knocked out in the third?" and "How do teams do after playing extra innings the night before?"

The sports data query language (SDQL) has the same basic format for all sports:
GAME REFERENCE:PARAMETER

Game references include:
p for the team's previous game
P for the previous match up
o for the team's opponent
n for the next game
N for the next match up
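
The GAME REFERENCE:PARAMETER format and the p and n references above can be illustrated with a minimal Python sketch. It is not the SDQL implementation: the function names split_term and lookup, the list-of-dictionaries layout, and the sample games are all assumptions made for illustration, and only the bare parameter, p and n are handled.

```python
def split_term(term):
    """Split 'p:runs' into ('p', 'runs'); a bare parameter has an empty reference."""
    reference, sep, parameter = term.partition(":")
    if not sep:
        return "", term.strip()
    return reference.strip(), parameter.strip()


def lookup(games, index, term):
    """Resolve a term for the game at `index` in one team's date-ordered games.

    Hypothetical helper: returns None when the referenced game does not exist.
    """
    reference, parameter = split_term(term)
    offsets = {"": 0, "p": -1, "n": 1}  # current, previous and next game
    if reference not in offsets:
        raise ValueError(f"reference {reference!r} is not handled in this sketch")
    target = index + offsets[reference]
    if target < 0 or target >= len(games):
        return None
    return games[target][parameter]


# Made-up games for a single team, oldest first.
games = [
    {"date": 20060610, "runs": 0},
    {"date": 20060611, "runs": 7},
    {"date": 20060612, "runs": 3},
]

print(split_term("p:runs"))        # ('p', 'runs')
print(lookup(games, 1, "p:runs"))  # 0
print(lookup(games, 1, "n:runs"))  # 3
```

The match-up references (P, N) and the opponent reference (o) would need opponent information in each record, which this sketch leaves out.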

Baseball

Because of the importance of pitching, baseball has the additional game references:
s for the starter's last start
S for the starter's last start against the current opponent

For baseball common parameters include:
at bats, attendance, biggest lead, conference, date, day, division, double header, earned runs, errors, fly balls, ground balls, hits, hitters faced, home runs, inning runs, left on base, line, losses, margin, matchup losses, matchup wins, month, over, pitchers used, pitches, playoffs, rbi, rest, run line, run line runs, runs, season, series game, series games, site, start time, starter, starter earned runs, starter era, starter fly balls, starter ground balls, starter hits, starter hitters faced, starter home runs, starter innings pitched, starter losses, starter matchup losses, starter matchup wins, starter pitches, starter rest, starter runs, starter strike outs, starter strikes, starter throws, starter walks, starter wins, starters previous game, starters previous match up, streak, strike outs, strikes, team, team left on base, temperature, time zone, total, umpires, under, walks, wins

MLB parameter descriptions and sample queries

attendance - This is the reported attendance. It can be useful to see how teams perform as a function of the attendance.
Sample SDQL query: attendance<10000

biggest lead - This is the biggest lead for a team in the game. It can be useful to see how teams perform after a loss in which they held, say, a three-run lead, or off a win in which they never trailed.
Sample SDQL query: biggest lead = 0

conference - This is league: either NL or AL. It can be useful, for example, to see how teams perform in interleague games.
Sample SDQL query: conference = NL

date - This is the date in eight-digit format. This is useful for setting a search-from date, for a recently emerging trend or system. June 10th 2006 is represented as 20060610.
Sample SDQL query: date>20050804
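
As a small illustration of the eight-digit format, the following Python sketch converts a calendar date to the YYYYMMDD number. The function name sdql_date is an assumption made for this example and is not part of SDQL.

```python
from datetime import date


def sdql_date(d):
    """Format a date as the eight-digit YYYYMMDD number described above."""
    return int(d.strftime("%Y%m%d"))


print(sdql_date(date(2006, 6, 10)))  # 20060610
```

Because the year comes first, then the month, then the day, comparing two such numbers gives the same result as comparing the dates themselves, which is why a query such as date>20050804 selects later games.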

day - This is the day of the week. The day must be spelled out completely with the first letter capitalized. This is useful to uncover how teams perform on a particular day of the week.
Sample SDQL query: day=Sunday

hits - This is the number of hits by the team. It can be useful to see how teams perform, for example, when they have had double-digit hits two games in a row.
Sample SDQL query: hits>=12

inning runs - This powerful parameter can be used to investigate how a team performs based on the number of runs it scored in a particular inning or range of innings. For example, how a team performs after a loss in which it scored at least 3 runs in the first inning, or after a win in which it was shut out in the last six innings.
Sample SDQL query: inning runs [:3] >=5
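
The [:3] in the sample query looks like Python slice notation. The sketch below shows only what such a slice selects from a list of runs per inning. How SDQL itself evaluates the expression is not stated here, so summing the selected innings is an assumption, as are the function name runs_in and the made-up line score.

```python
def runs_in(inning_runs, innings):
    """Total the runs in the innings selected by a Python slice."""
    return sum(inning_runs[innings])


# Made-up runs per inning for one team over nine innings.
line_score = [2, 0, 3, 0, 0, 1, 0, 0, 0]

first_three = runs_in(line_score, slice(None, 3))  # same as line_score[:3]
last_six = runs_in(line_score, slice(-6, None))    # same as line_score[-6:]

print(first_three >= 5)  # True: 2 + 0 + 3 = 5 runs in the first three innings
print(last_six == 0)     # False: one run was scored in the sixth inning
```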

left on base - This is the number of runners left on base by the hitters. This is not to be confused with team left on base. The difference can be demonstrated with two examples. In one inning the first three batters walk and the next three strike out. The 'left on base' for this inning is nine: three for each of the batters who struck out. If the first two batters strike out, the next three walk and the last batter strikes out, the 'left on base' is only three. In both cases, the 'team left on base' is three. This parameter is useful, for example, to see how a team performs after a one-run loss in which their hitters stranded at least five more runners than the opponent.
Sample SDQL query: left on base < 5
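
The two innings described above can be worked through with a short Python sketch. It is a simplified model that handles only walks ('BB') and strikeouts ('K'); the function name and the encoding of plate appearances are assumptions made for illustration.

```python
def left_on_base(plate_appearances):
    """Return (left on base, team left on base) for one half-inning.

    plate_appearances is a sequence of 'BB' (walk) and 'K' (strikeout).
    """
    runners = 0      # runners currently on base
    outs = 0
    hitters_lob = 0  # runners stranded, charged to each batter who makes an out
    for result in plate_appearances:
        if outs == 3:
            break
        if result == "BB":
            # With only walks, runners advance when forced; a bases-loaded
            # walk scores a run and leaves the bases loaded.
            runners = min(runners + 1, 3)
        elif result == "K":
            hitters_lob += runners
            outs += 1
        else:
            raise ValueError(f"unsupported result: {result!r}")
    return hitters_lob, runners


print(left_on_base(["BB", "BB", "BB", "K", "K", "K"]))  # (9, 3)
print(left_on_base(["K", "K", "BB", "BB", "BB", "K"]))  # (3, 3)
```

The first value counts the runners on base each time a batter makes an out, while the second counts only the runners still on base when the inning ends.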

line - This is the Vegas line. The lines are stored in five-cent increments and ten-cent lines are used. For example, a pick game is when both teams are minus 105. To query on favorites, set the line to less than minus 105 and to query on underdogs, set the line to greater than zero.
Sample SDQL query: line < -199
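
The favorite and underdog rules above can be restated as a small Python sketch. The function name classify_line and the "unclassified" fallback for values the description does not cover are assumptions made for this example.

```python
def classify_line(line):
    """Label a line value using the thresholds given in the description above."""
    if line < -105:
        return "favorite"
    if line > 0:
        return "underdog"
    if line == -105:
        return "pick"
    return "unclassified"  # values the description does not cover


print(classify_line(-199))  # favorite
print(classify_line(150))   # underdog
print(classify_line(-105))  # pick
```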

margin - This is the margin by which a team won or lost. It is in units of runs. This is a very commonly used parameter and can be used, for example, to uncover how a team performs in the third game of a series when they lost the first two by a single run.
Sample SDQL query: margin = 1

month - This is the month. Months are numbered rather than named, which facilitates queries involving, say, "after April": that is month greater than 4, because April is the fourth month.
Sample SDQL query: month = 4

pitchers used - This is the number of pitchers used by the team. It can be useful to see how teams perform when they are off a win in which they used at least five pitchers or when they used at least five pitchers for two straight games.
Sample SDQL query: pitchers used > 5

playoffs - This restricts the search to only regular season results or only playoff results. To search only playoff results, set playoffs=1; to search only regular season results, set playoffs=0. The KillerSports.com default is to search on both playoff and regular season results.
Sample SDQL query: playoffs=0

rest - This is the number of days of rest a team has between games. Usually it is one or zero, but rainouts can expand this number to two, three or even four.
Sample SDQL query: rest > 0

runs - This is the total number of runs a team scored in a game. It can be used, for example, to see how a team performs after being shut out, after scoring at least six runs and losing, or after scoring three runs or fewer in three straight games.
Sample SDQL query: p:runs=0

season - This is simply the season. It is very useful for seeing how trends and systems have been evolving over the seasons. For example, let's say you uncovered a system in which underdogs have a record of 202-199 over the past four seasons, making a $100 player $4650. This is a very strong system, but how has it done recently? How has it done in the current season? By simply adding 'and season' to the end of the query, the results will be given for each season individually.
Sample SDQL query: season > 2005

series game - This tremendously useful parameter is the number of the game in the series. This can be used, for example, to determine how a team performs in the second game of a home series when they lost the first as a favorite or in the third game of a three-game series when they lost the first two.
Sample SDQL query: series game = 3

series games - This is the total number of games in the series. For example, if the team is playing the first game of a three-game series, the series games is 3 while the series game is 1. This can be used, for example, to determine how a team performs in the second game of a two-game home series when they won the first by at least five runs as an underdog.
Sample SDQL query: series games = 4

site - This commonly used parameter is simply the site of the game. In rare instances when the site is neutral, the home team is the team that bats second.
Sample SDQL query: site = home

start time - This is the start time of the game. All times are local and military time is used, with all four digits in a single string. For example, 7:05 pm is 1905.
Sample SDQL query: start time<1400

starter - This is the starting pitcher. The pitcher's first and last name must be given in single quotes. This allows a complete investigation of how a starting pitcher performs in various situations: for example, as a road underdog of +150 or more, when his team lost his last two starts, when the team is on a three-game losing streak, or when he got fewer than three runs of support in two straight starts.
Sample SDQL query: starter = 'Roger Clemens'

starter innings pitched - This is the number of innings pitched by the starter. This can be used to determine how a pitcher performs when he went eight-plus innings in his last start, when he is off a start in which he went fewer than four innings, or when he went 7+ innings and his team lost his last start.
Sample SDQL query: starter innings pitched < 5

starter throws - This is the handedness of the starting pitcher. It can be used, for example, to see how a team performs against a lefty when they faced righties for three straight games.
Sample SDQL query: starter throws = left

strike outs - This is the number of times a team struck out in a game.
Sample SDQL query: strike outs >= 10

team - This is the name of the team. The database at KillerSports.com uses the nickname of the team. This parameter can be used to see how a team performs vs another or how a team performs after a series vs a particular opponent.
Sample SDQL query: team = Blue Jays

team left on base - This is the total number of runners stranded at the end of each inning. This parameter can be used to uncover how a team performs as a function of the number of runners that go from the basepaths to the field.
Sample SDQL query: team left on base > 12

temperature - This is the reported temperature at the start of the game. It can be used, for example, to see how teams perform in cold temperatures or how starters perform in hot temperatures.
Sample SDQL query: temperature < 60

total - This is the consensus Vegas over/under (OU) line for the game. It can be used, for example, to see how a team performs when the OU line is high or how a starting pitcher performs on the road when the OU line is low.
Sample SDQL query: total < 7.5

walks - This is the number of walks the team drew, not the number of walks their pitchers allowed. It can be used, for example, to see how teams perform in games in which they did not draw a walk, or after a game in which they drew at least five walks.
Sample SDQL query: walks = 0


Wikimedia Foundation. 2010.
