
About Us

Leveraging the intellectual capital and expertise of faculty in the Carlos Alvarez College of Business at UTSA, the Data Analytics Center (DAC) provides high-end data science, analytics and artificial intelligence (AI) solutions to academic and industry partners. We create and warehouse proprietary data sets that are unavailable elsewhere, and we provide organizations with end-to-end data analysis and visualization services that help them make better business decisions.

The DAC uses state-of-the-art technology and encryption, and it is committed to the responsible use of data and to privacy protection.

Our solutions accelerate and elevate decision-making, leveraging the latest mathematical, statistical and technological techniques to unlock the power of data.

Primary Services

  • Application Development for Near Real-Time Access and Decision Making
  • Data Visualization and Exploration Tools
  • Secure Data Acquisition, Storage and Curation
  • Scalable High-End Analytics, Data Science and AI Solutions

Data Sets Available

Business Transcripts

  • Conference call transcripts for publicly traded companies (United States)
  • Comprehensive data set for feature presentation
  • Personality/behavioral features available for some speakers (e.g., CEOs)
  • Comprehensive metadata available (firm and speaker levels)

BoardEx

  • Acquired with subscription
  • Board Member Network Database

Financial Analyst Records

  • ~1946 – 2017
  • SEC/FINRA-style data replication
  • Exams, certification, registration history, disclosure events

Political Contributions

  • Contributor, recipient, date

SANS Internet Storm Center

  • Replication + derived features

SEC Filings

  • ~1994 – Present
  • Full replication and feature parsing ongoing
  • Largest data set (size, records, and features)

Stock Market Data

  • Minute interval for ~4,000 tickers
  • Close, open, high, low, volume

Stocktwits

  • Twitter-style social network for day traders
  • Comments & metadata

Movies

  • Script-level data for ~2k movies
  • Line-level data by actor/character
  • Metadata (writers, genre, title, etc.)
  • Financial data (covers more movies than the ~2k with scripts)
    • Domestic and foreign
    • Various aggregate metrics

Reddit

  • Comments
  • Thread metadata
  • Author metadata
  • Flair/tag metadata
  • Discussion hierarchy preservation
  • On-request/on-demand

4chan

  • ~2003 – Present
  • Feature types similar to those in the Reddit data

Songs

  • Lyrical data for ~800,000 songs
  • Metadata (artist, album, title)
  • Basic text features (e.g., n-grams)
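
For readers unfamiliar with n-gram features, the following is a minimal Python sketch of how word-level n-grams might be counted for a single lyric. It assumes simple lowercase whitespace tokenization, the function names `ngrams` and `ngram_counts` are hypothetical, and it does not describe the DAC's actual feature pipeline.

```python
from collections import Counter


def ngrams(text, n):
    """Return the word-level n-grams of `text` as a list of tuples.

    Assumption: tokens are produced by lowercasing and splitting on
    whitespace; a real pipeline would likely handle punctuation too.
    """
    tokens = text.lower().split()
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def ngram_counts(text, n):
    """Count how often each n-gram occurs in `text`."""
    return Counter(ngrams(text, n))


# Illustrative input only; not taken from the data set.
line = "Row row row your boat"
bigram_counts = ngram_counts(line, 2)
print(bigram_counts.most_common(1))  # [(('row', 'row'), 2)]
```

Counts like these can be stored per song alongside the artist, album and title metadata.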

Other helper data sets

  • WordNet – text features
  • Census – name, gender and birthdate
  • IPGeo – IP geolocation data
  • Surname origin

Data Analytics Center

Business Building, Room 1.01.14
1 UTSA Circle
San Antonio, TX
78249

Office Hours: Monday – Friday | 8 a.m. – noon and 1 – 5 p.m. or by appointment
