Arion Research LLC

Understanding Business Decision Data

We recently published a research report on the Decision Intelligence (DI) market. It’s a market that I’ve followed for several years and a category of software that I find very compelling based on business value delivered. As I was working through building the report, I looked extensively at business data and data sources, an obvious point of failure in any automation and DI implementation. The following shows, in a somewhat simplistic way, the sources of “business data”:

As you can see, most of these sources are digital and, with some effort to deploy a data infrastructure plus some integration work, can be managed effectively. That’s not to minimize the effort, of course: ensuring data quality and completeness requires resources, time, and a clear data infrastructure strategy. That’s outside the scope of this article, though, so I’ll leave it for another discussion.

Digital Data Sources

Existing Company Data

Existing company data refers to all the information that a company has collected over time from various operations and interactions. This data can be categorized into several types:

  • Customer Data: Information about customers including personal details, purchase history, feedback, and interactions.

  • Financial Data: Accounting records, revenue, expenses, profit margins, and investment details.

  • Operational Data: Data related to the daily operations of the company such as supply chain information, logistics, and production metrics.

  • Employee Data: Information about employees including HR records, performance reviews, payroll, and benefits.

  • Sales and Marketing Data: Data from sales transactions, marketing campaigns, lead generation, and conversion rates.

  • Product Data: Information about products or services including specifications, performance metrics, and development logs.

  • Customer Support Data: Records of customer service interactions, issues resolved, and service feedback.

  • Market Data: Data regarding market trends, competitor analysis, and industry benchmarks.

  • Compliance and Legal Data: Information related to regulatory compliance, legal contracts, and risk management.

Internal Data Sources

Companies gather data from various internal sources, including but not limited to:

  • CRM Systems: Customer Relationship Management systems store customer interactions and sales data.

  • ERP Systems: Enterprise Resource Planning systems manage business processes including finance, HR, and supply chain.

  • HR Systems: Human Resource Management systems track employee information and performance.

  • Accounting Software: Financial data is managed through accounting and finance software.

  • Marketing Platforms: Tools and software used for managing marketing campaigns and analyzing results.

  • Customer Support Systems: Help desk and support ticket systems track customer service interactions.

  • Production Systems: Manufacturing execution systems (MES) and other operational software.

Tech Infrastructure to Aggregate and Manage Data

To effectively aggregate and manage this data, companies rely on a robust tech infrastructure which may include:

  • Data Warehouses: Central repositories for storing and managing large volumes of structured data from different sources.

  • ETL Tools: Extract, Transform, Load tools for consolidating data from various sources into a data warehouse.

  • Data Lakes: Storage repositories that hold vast amounts of raw data in its native format until needed.

  • Database Management Systems (DBMS): Software systems for creating and managing databases.

  • Data Integration Platforms: Tools that enable seamless data flow between disparate systems.

  • APIs: Application Programming Interfaces for connecting different software applications and systems.

  • Data Governance Tools: Solutions to ensure data quality, security, and compliance.

  • Analytics Platforms: Tools like BI (Business Intelligence) software for analyzing and visualizing data.

Problem with Disconnected Data Silos

Disconnected data silos present several challenges:

  • Data Inconsistency: Inconsistencies arise when different departments use separate systems, leading to conflicting information.

  • Limited Accessibility: Data stored in silos is often inaccessible to other parts of the organization, hindering cross-functional analysis.

  • Inefficiency: Managing and reconciling data across silos is time-consuming and resource-intensive.

  • Inhibited Decision Making: Lack of a unified view of data makes it difficult to derive actionable insights and make informed decisions.

  • Increased Costs: Duplicate data storage and processing in silos can lead to increased operational costs.
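To make the data inconsistency problem concrete, here is a minimal sketch that compares extracts from two departmental systems. The record layouts, the customer IDs, and the names `crm_records` and `billing_records` are hypothetical and chosen only for illustration.

```python
# Hypothetical extracts from two departmental systems, keyed by customer ID.
crm_records = {
    "C001": {"email": "ana@example.com", "country": "US"},
    "C002": {"email": "raj@example.com", "country": "IN"},
    "C003": {"email": "li@example.com", "country": "CN"},
}
billing_records = {
    "C001": {"email": "ana@example.com", "country": "US"},
    "C002": {"email": "raj.k@example.com", "country": "IN"},
    "C004": {"email": "sam@example.com", "country": "GB"},
}


def find_conflicts(left, right):
    """Return (key, field, left_value, right_value) for each mismatch on shared keys."""
    conflicts = []
    for key in sorted(left.keys() & right.keys()):
        for field in sorted(left[key].keys() & right[key].keys()):
            if left[key][field] != right[key][field]:
                conflicts.append((key, field, left[key][field], right[key][field]))
    return conflicts


def find_unshared(left, right):
    """Return the keys that exist in only one of the two systems."""
    return sorted(left.keys() ^ right.keys())


conflicts = find_conflicts(crm_records, billing_records)
unshared = find_unshared(crm_records, billing_records)
print(conflicts)
print(unshared)
```

In this toy data, one customer has a different email address in each system and two customers exist in only one system. Those are the kinds of discrepancies that have to be reconciled by hand when silos are not connected.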

Data Lakes and Data Federation as a Part of the Solution

Data Lakes

Data lakes offer a solution to data silos by providing a centralized repository for all data types (structured, semi-structured, and unstructured). Key advantages include:

  • Scalability: Can store massive amounts of data at a relatively low cost.

  • Flexibility: Allows storing raw data in its native format.

  • Accessibility: Provides a single source of truth, making data more accessible for analysis and reporting.

Data Federation

Data federation involves creating a virtual database that allows querying and managing data across multiple sources without moving it. Key advantages include:

  • Unified Access: Provides a unified view of data across different systems.

  • Real-Time Integration: Enables real-time data integration and access without the need for ETL processes.

  • Cost-Effective: Reduces the need for extensive data storage by accessing data from its original source.

By leveraging data lakes and data federation, companies can break down data silos, ensure data consistency, and enable comprehensive analytics for better decision-making.
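The federation idea can be illustrated at toy scale with Python's built-in `sqlite3` module. This is only a sketch: SQLite's `ATTACH DATABASE` stands in for a real federation engine, which would typically query remote and heterogeneous systems. The schema names `crm` and `billing`, the tables, and the rows are all hypothetical.

```python
import sqlite3

# One connection plays the role of the virtual layer. Each attached database
# stands in for a separate source system (hypothetical names: crm, billing).
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS crm")
conn.execute("ATTACH DATABASE ':memory:' AS billing")

conn.execute("CREATE TABLE crm.customers (id TEXT PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE billing.invoices (customer_id TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO crm.customers VALUES (?, ?)",
    [("C001", "Ana"), ("C002", "Raj")],
)
conn.executemany(
    "INSERT INTO billing.invoices VALUES (?, ?)",
    [("C001", 120.0), ("C001", 80.0), ("C002", 50.0)],
)

# A single query spans both sources; no rows are copied from one to the other.
rows = conn.execute(
    """
    SELECT c.name, SUM(i.amount) AS total
    FROM crm.customers AS c
    JOIN billing.invoices AS i ON i.customer_id = c.id
    GROUP BY c.id, c.name
    ORDER BY c.id
    """
).fetchall()
print(rows)
```

The point of the sketch is the shape of the query: the caller sees one logical database, while the data stays where it already lives.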

Web-based Real-time Data 

Types of Real-Time, Web-Based Data

  • Social Network Sources

    • Social Media Posts: Updates, comments, and shares from platforms like Facebook, Twitter, Instagram, LinkedIn, etc.

    • Engagement Metrics: Likes, shares, retweets, comments, and other forms of user interaction.

    • User Sentiment: Analysis of the sentiment behind social media mentions and comments.

    • Influencer Data: Information on influencers relevant to the business’s industry, including their reach and engagement.

    • Trends and Hashtags: Monitoring trending topics, hashtags, and discussions relevant to the business.

  • Website Behavioral Data

    • User Visits: Tracking the number of visitors, page views, and session duration.

    • Clickstream Data: Paths users take through the website, including clicks, navigation patterns, and page transitions.

    • Conversion Rates: Data on how many visitors complete desired actions like purchases, sign-ups, or downloads.

    • Bounce Rates: Percentage of visitors who leave the site after viewing only one page.

    • A/B Testing Results: Performance data from different versions of web pages or marketing campaigns.

    • User Demographics: Information about the users such as location, device used, and other demographic details.

  • Other Web-Based Data

    • E-commerce Transactions: Real-time sales data, shopping cart contents, and purchase history.

    • Ad Performance Data: Metrics from online advertising campaigns such as impressions, clicks, cost-per-click (CPC), and return on ad spend (ROAS).

    • Competitor Analysis: Data on competitors' website performance, online presence, and digital marketing efforts.

    • Review and Rating Sites: Customer reviews, ratings, and feedback from platforms like Yelp, Google Reviews, and TripAdvisor.

    • News and Blogs: Mentions of the business, industry news, and blog posts that could impact the business.
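Several of the website and advertising metrics listed above reduce to simple ratios. The sketch below assumes a hypothetical session log and made-up spend figures. It follows the definition of bounce rate given above (sessions that viewed only one page), although some analytics tools define a bounce differently.

```python
# Hypothetical session log: pages viewed per session and whether it converted.
sessions = [
    {"pages_viewed": 1, "converted": False},
    {"pages_viewed": 4, "converted": True},
    {"pages_viewed": 1, "converted": False},
    {"pages_viewed": 3, "converted": False},
    {"pages_viewed": 6, "converted": True},
]


def bounce_rate(sessions):
    """Share of sessions that viewed exactly one page."""
    bounces = sum(1 for s in sessions if s["pages_viewed"] == 1)
    return bounces / len(sessions)


def conversion_rate(sessions):
    """Share of sessions that completed the desired action."""
    converted = sum(1 for s in sessions if s["converted"])
    return converted / len(sessions)


def cost_per_click(ad_spend, clicks):
    """CPC: total ad spend divided by the number of clicks."""
    return ad_spend / clicks


def return_on_ad_spend(ad_revenue, ad_spend):
    """ROAS: revenue attributed to ads divided by ad spend."""
    return ad_revenue / ad_spend


print(bounce_rate(sessions), conversion_rate(sessions))
print(cost_per_click(500.0, 250), return_on_ad_spend(1500.0, 500.0))
```

The functions assume non-empty inputs and non-zero denominators; a production version would need to handle empty logs and zero clicks or spend.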

Collecting and Consolidating Real-Time Web-Based Data

  • Data Collection

    • APIs: Utilize APIs provided by social media platforms, web analytics tools, and other data sources to collect real-time data.

    • Web Scraping: Employ web scraping tools and techniques to gather data from websites and social media platforms that do not provide APIs.

    • Tracking Pixels: Implement tracking pixels on websites to collect user behavior and conversion data.

    • Third-Party Tools: Use third-party analytics and monitoring tools to aggregate data from various sources.

  • Consolidation into Existing Company Data

    • Data Integration Platforms: Use platforms like Apache Kafka, Apache NiFi, or cloud-based solutions like AWS Glue to integrate real-time data with existing datasets.

    • ETL Processes: Implement ETL (Extract, Transform, Load) processes to aggregate and transform real-time data before loading it into data warehouses or data lakes.

    • Data Warehousing: Store consolidated data in a central data warehouse where it can be accessed and analyzed alongside other existing company data.

    • Data Lakes: Utilize data lakes for storing raw, real-time data, allowing for flexible querying and analysis.

By effectively collecting, ensuring the quality of, and consolidating real-time, web-based data with existing company data, businesses can gain comprehensive insights, improve decision-making, and enhance their competitive edge.
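As a rough illustration of the ETL step described above, the following sketch extracts raw JSON events, transforms them into a uniform shape, and loads them into a table. It uses an in-memory SQLite database as a stand-in for a data warehouse. The event payloads, the `web_events` table, and its columns are assumptions made for the example.

```python
import json
import sqlite3
from datetime import datetime, timezone

# Extract: raw events as they might arrive from an API or tracking pixel
# (hypothetical payloads, including one malformed line).
raw_events = [
    '{"user": "U1", "event": "page_view", "ts": 1700000000}',
    '{"user": "U2", "event": "purchase", "ts": 1700000060, "amount": "49.90"}',
    "not valid json",
]


def transform(line):
    """Parse one raw event; return a normalized tuple, or None if unusable."""
    try:
        record = json.loads(line)
    except json.JSONDecodeError:
        return None
    occurred_at = datetime.fromtimestamp(record["ts"], tz=timezone.utc).isoformat()
    amount = float(record.get("amount", 0.0))
    return (record["user"], record["event"], occurred_at, amount)


def load(conn, rows):
    """Create the target table if needed and insert the normalized rows."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS web_events "
        "(user_id TEXT, event TEXT, occurred_at TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO web_events VALUES (?, ?, ?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")
clean_rows = [row for row in map(transform, raw_events) if row is not None]
load(conn, clean_rows)
loaded = conn.execute(
    "SELECT user_id, event, amount FROM web_events ORDER BY user_id"
).fetchall()
print(loaded)
```

The malformed line is dropped during the transform step rather than loaded, which is one simple way to keep bad records out of downstream analysis. A real pipeline would more likely route such records to a quarantine area for review.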

Research and Third-party Data

Research-Based Data

  • Primary Research

    • Surveys and Questionnaires: Data collected directly from respondents through structured surveys.

    • Interviews: In-depth data gathered from one-on-one or group interviews.

    • Focus Groups: Insights collected from guided discussions with selected participants.

    • Observations: Data obtained by observing subjects in natural or controlled settings.

    • Experiments: Data from controlled experiments designed to test specific hypotheses.

  • Secondary Research

    • Industry Reports: Pre-existing reports published by industry experts, trade associations, and market research firms.

    • Academic Papers: Research findings published in academic journals.

    • Public Records: Government and public sector data, such as census data, economic indicators, and regulatory filings.

    • News Articles: Information from newspapers, magazines, and online news portals.

    • Market Studies: Reports and analyses from market research firms.

Third-Party Data

  • Buyer Intent Data

    • Behavioral Data: Information on online behavior indicating purchase intent, such as website visits, content downloads, and form submissions.

    • Purchase History: Data on previous purchases made by potential buyers.

    • Demographic Data: Information on potential buyers’ demographics, including age, gender, income level, and location.

  • Analyst Reports

    • Industry Analyses: In-depth analyses of industry trends, market dynamics, and competitive landscapes.

    • Forecasts and Projections: Future market trends and projections based on current data.

  • Research Reports

    • Market Research Reports: Comprehensive reports on specific markets, including size, growth, and key players.

    • Consumer Reports: Insights into consumer behavior, preferences, and trends.

  • Data Enrichment

    • Contact Information: Enhanced data with additional contact details like email addresses and phone numbers.

    • Firmographics: Additional information about companies, including size, industry, revenue, and key personnel.

Incorporating and Integrating Data

  • Data Collection and Integration

    • APIs and Data Feeds: Utilize APIs and data feeds provided by third-party data providers to integrate real-time data into existing systems.

    • ETL Tools: Employ ETL (Extract, Transform, Load) tools to aggregate and transform data from various sources into a standardized format.

    • Data Warehousing: Store the integrated data in a data warehouse for centralized access and analysis.

    • Data Lakes: Use data lakes for storing large volumes of raw data, providing flexibility for future analysis.

  • Example Uses

    • Market Analysis: Use secondary research and analyst reports to understand market dynamics and competitive landscapes.

    • Customer Segmentation: Apply buyer intent and enriched demographic data to segment customers and tailor marketing strategies.

    • Product Development: Leverage primary research data from surveys and focus groups to inform product development and improvements.

    • Sales Strategies: Utilize buyer intent data to identify high-potential leads and prioritize sales efforts.

    • Risk Management: Incorporate analyst reports and public records to assess market risks and regulatory compliance.

  • Need to Purchase Some Data Types

    • Exclusive Data Sets: Certain data, such as detailed market research reports and proprietary analyst insights, may require purchase due to their specialized and high-value nature.

    • High-Quality Data: Purchasing data from reputable third-party providers can improve accuracy, reliability, and comprehensiveness, though purchased data should still pass the company's own quality checks.

    • Timely Information: Access to real-time and up-to-date data often necessitates a subscription or purchase to maintain a competitive edge.

Potential Integration Methods

  • API Integration: Directly connect third-party data providers via APIs for seamless data flow into company databases.

  • ETL Processes: Implement ETL processes to aggregate, transform, and load data from multiple sources into a unified data warehouse.

  • Data Governance Tools: Use data governance tools to ensure the quality, security, and compliance of integrated data.

  • BI Tools: Leverage business intelligence tools to analyze and visualize integrated data for strategic decision-making.

  • CRM Integration: Integrate enriched customer data into CRM systems to enhance customer relationship management and personalization efforts.

By effectively incorporating and integrating research-based and third-party data, businesses can enhance their strategic planning, marketing efforts, product development, and overall decision-making processes.
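The data enrichment and CRM integration steps above can be pictured as a lookup that adds third-party fields to existing records. This is a minimal sketch under stated assumptions: the firmographic feed, the lead records, and the choice of email domain as the matching key are all hypothetical.

```python
# Hypothetical third-party firmographic feed, keyed by company domain.
firmographics = {
    "acme.example": {"industry": "Manufacturing", "employees": 1200},
    "globex.example": {"industry": "Energy", "employees": 300},
}

# Hypothetical CRM leads.
leads = [
    {"name": "Ana", "email": "ana@acme.example"},
    {"name": "Raj", "email": "raj@unknown.example"},
]


def enrich(lead, firmographics):
    """Return a copy of the lead with firmographic fields added when the domain is known."""
    domain = lead["email"].rsplit("@", 1)[-1].lower()
    enriched = dict(lead)
    enriched.update(firmographics.get(domain, {}))
    enriched["enriched"] = domain in firmographics
    return enriched


enriched_leads = [enrich(lead, firmographics) for lead in leads]
print(enriched_leads)
```

Leads with no match are kept and flagged rather than dropped, so the CRM record stays complete even when the third-party source has nothing to add.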

Data Quality Assurance (QA)

Data Quality Assurance (QA) is the process of ensuring that the data collected, stored, and utilized by a company is accurate, consistent, complete, and reliable. This involves systematic procedures and practices designed to validate and maintain data integrity, from initial collection through various stages of processing and analysis. Data QA is crucial for companies because it directly impacts the reliability of business insights and decisions. Poor data quality can lead to erroneous conclusions, misguided strategies, and inefficient operations, ultimately affecting a company's bottom line and reputation. By implementing robust data QA processes, companies can ensure their data-driven decisions are based on trustworthy information, thereby enhancing operational efficiency, customer satisfaction, and competitive advantage.

Methods: 

  • Data Validation: Implement validation rules to ensure data accuracy and consistency during collection.

  • Data Cleaning: Remove duplicates, correct errors, and fill in missing values to maintain data quality.

  • Consistency Checks: Regularly compare data from different sources to identify and resolve discrepancies.

  • Automated Monitoring: Set up automated alerts and monitoring to detect anomalies or data quality issues in real-time.
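The validation and cleaning methods above might look like the following in practice. This is a sketch, not a complete QA framework: the field names, the age range, the loose email pattern, and the `UNKNOWN` default are illustrative assumptions.

```python
import re

# Deliberately loose email check; real validation rules are usually stricter.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def validate(record):
    """Return a list of rule violations for one record (an empty list means valid)."""
    problems = []
    if not record.get("customer_id"):
        problems.append("missing customer_id")
    if not EMAIL_PATTERN.match(record.get("email") or ""):
        problems.append("invalid email")
    if record.get("age") is not None and not 0 < record["age"] < 120:
        problems.append("age out of range")
    return problems


def clean(records, default_country="UNKNOWN"):
    """Normalize emails, fill missing countries, and drop duplicate customer IDs (first wins)."""
    seen = set()
    cleaned = []
    for record in records:
        key = record.get("customer_id")
        if key in seen:
            continue
        seen.add(key)
        fixed = dict(record)
        fixed["email"] = (fixed.get("email") or "").strip().lower()
        fixed["country"] = fixed.get("country") or default_country
        cleaned.append(fixed)
    return cleaned


records = [
    {"customer_id": "C001", "email": " Ana@Example.com ", "age": 34, "country": "US"},
    {"customer_id": "C001", "email": "ana@example.com", "age": 34, "country": "US"},
    {"customer_id": "C002", "email": "raj-at-example.com", "age": 150, "country": None},
]

cleaned = clean(records)
report = {r["customer_id"]: validate(r) for r in cleaned}
print(report)
```

Running the same `validate` function on a schedule and alerting when the share of failing records crosses a threshold is one simple way to approach the automated monitoring item.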

Human-sourced Data

Human-sourced business data refers to information that is directly obtained from individuals, whether they are employees, customers, or prospects. This type of data is collected through various means where human input is essential, and it includes qualitative and quantitative data that provide valuable insights for businesses. Human-sourced data is somewhat more difficult to capture than other data sources, particularly when it comes from outside the company. A great deal of company resources are devoted to capturing it, including sales, sales development, marketing, and customer service teams.

Types of Human-Sourced Business Data

  • Employee Data

    • Surveys and Feedback: Information gathered through employee satisfaction surveys, engagement surveys, and feedback forms. This data helps in understanding employee morale, job satisfaction, and areas needing improvement.

    • Performance Reviews: Data from regular performance appraisals, one-on-one meetings, and peer reviews. It provides insights into employee productivity, skills, and development needs.

    • Exit Interviews: Insights obtained from departing employees about their reasons for leaving, which can help identify organizational issues and improve retention strategies.

    • Skills and Training Data: Information about employees’ skills, certifications, and training needs, which aids in workforce planning and development.

  • Customer Data

    • Customer Feedback: Comments, reviews, and ratings provided by customers about products or services. This data is crucial for product development, quality improvement, and customer satisfaction.

    • Surveys and Questionnaires: Data collected from customer satisfaction surveys, NPS (Net Promoter Score) surveys, and market research questionnaires. It provides insights into customer preferences, experiences, and expectations.

    • Support Interactions: Information from customer service interactions, such as chat logs, call transcripts, and email exchanges. This data helps improve customer support processes and resolve common issues.

  • Prospect Data

    • Lead Information: Data collected from potential customers through lead capture forms, webinar registrations, and event sign-ups. This includes contact information, company details, and areas of interest.

    • Behavioral Data: Information obtained from direct interactions with prospects, such as responses to outreach emails, engagement during sales calls, and participation in product demos.

    • Survey Responses: Insights from surveys conducted with prospects to understand their needs, preferences, and buying intentions.

    • Direct Discovery: Data collected from direct interactions with prospects by company employees. 
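The NPS surveys mentioned under customer data reduce to a short calculation on a 0 to 10 scale: the percentage of promoters (ratings of 9 or 10) minus the percentage of detractors (ratings of 0 through 6). The sketch below uses hypothetical survey responses.

```python
def net_promoter_score(ratings):
    """NPS = percent promoters (9-10) minus percent detractors (0-6)."""
    if not ratings:
        raise ValueError("at least one rating is required")
    promoters = sum(1 for r in ratings if r >= 9)
    detractors = sum(1 for r in ratings if r <= 6)
    return 100.0 * (promoters - detractors) / len(ratings)


survey_ratings = [10, 9, 9, 8, 7, 6, 3, 10, 5, 9]  # hypothetical responses
score = net_promoter_score(survey_ratings)
print(score)
```

Ratings of 7 and 8 count toward the total number of responses but not toward either group, which is why the score can move even when no one becomes a promoter or a detractor.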

Importance of Human-Sourced Business Data

Human-sourced business data is vital because it provides direct insights from the people most closely associated with a company’s operations, products, and services. This data helps businesses:

  • Improve Products and Services: By understanding customer feedback and preferences, companies can tailor their offerings to better meet market demands.

  • Enhance Employee Engagement: Employee feedback helps create a better work environment, leading to increased job satisfaction and productivity.

  • Optimize Marketing and Sales: Prospect data allows businesses to target their marketing and sales efforts more effectively, improving conversion rates and customer acquisition.

  • Make Informed Decisions: Direct insights from employees, customers, and prospects ensure that business strategies are grounded in real-world experiences and expectations.

Incorporating human-sourced data into company data systems involves robust data collection processes, ensuring data quality, and integrating this data with other business intelligence tools to derive actionable insights. 

In an era where data drives business decisions, the complexity and challenges of handling human-sourced data cannot be overstated. The diverse nature of this data, from employee feedback to customer interactions and prospect behaviors, presents unique difficulties in terms of collection, quality assurance, and integration. Unlike digital data, which can be more easily managed through technological infrastructure, human-sourced data requires meticulous effort to ensure accuracy, completeness, and reliability. However, the insights derived from this type of data are invaluable, offering direct perspectives from the people most integral to a company's success. By leveraging robust data collection and quality assurance processes, businesses can effectively incorporate human-sourced data into their existing data ecosystems. This integration not only enhances decision-making and strategic planning but also drives improvements in sales effectiveness, product development, employee engagement, and customer satisfaction. Embracing the challenges of human-sourced data with a structured approach allows businesses to transform raw insights into actionable intelligence, thereby maintaining a competitive edge in today's data-centric landscape.