Enterprise Data Management


New Data Architectures in Support of Emerging Architecture Styles.

To support a variety of drivers, particularly agility and cybersecurity, the government is embracing new software architectural styles such as:

  • Microservices
  • Serverless architecture
  • Zero trust
  • Data lakes
  • Data fabric
  • Data mesh
  • Blockchain and Distributed Ledger Technologies

These new architecture styles dramatically affect how both software and data are organized. Axiologic Solutions has subject matter experts (SMEs) who can apply the emerging data architecture and design patterns needed to properly structure data for these architecture styles.

Data engineering.

As data becomes increasingly critical to a variety of business domains, including data science, business intelligence, cybersecurity, data mining, data migration, data analysis, and business reporting (including auditing), the need for well-thought-out and sound data solutions has emerged. This discipline is commonly referred to as “data engineering.”

Though the term “data engineering” is relatively new, the function has deep roots in systems engineering and software engineering, particularly in ETL data processing. Simply put, data engineering is focused on helping an organization (and its many functions) locate, access, move, process, and analyze data. If data is at the center of the capability or requirement, then data engineering has an important role; if data is secondary, then the software engineering or systems engineering disciplines take the lead.

Axiologic Solutions provides the following data engineering services:

  • Data Centric Activities
    • Data architecture
    • Modeling: Business information, conceptual, logical, physical, multi-dimensional
    • Data Warehousing, data marts
    • Preparation, transformation, curation, enrichment
    • Replication
    • Governance
    • Analysis
    • Consumption
    • Other data lifecycle activities (DAMA.org)
    • DataOps
  • Software Centric Activities
    • Architecture
    • Engineering and design
    • Development
    • Agile development
    • RMF
    • DevSecOps
    • MLDevSecOps
  • Technology
    • Architecture
    • Engineering (focus on infrastructure data storage/processing)
  • Automation
    • Architecture
    • Design
    • Integration
  • Consumption
    • Data science
    • Business Intelligence
    • Visualization
  • Operations
    • Optimize data storage infrastructure
    • Database management
    • Data operations management (per DAMA.org)
    • Change management
    • Configuration management
    • Data lifecycle management
    • Security management
    • Job/workflow execution

Data ingestion.

Data ingestion is an important area of enterprise data management (EDM) where significant resources are spent. Axiologic Solutions offers a mature set of data ingestion services, including:

  • Data Ingestion Strategy and Overall Approach
    Data ingestion is a rather generic “data pattern” for moving and processing data. It is important to clearly define the role/goals/objectives/purpose of data ingestion in the particular business/system context.
  • Ingestion Types
    We provide expertise in three types of data ingestion:
    • Batch data ingestion
    • Real-time or streaming data ingestion
    • Hybrid (Lambda architecture) data ingestion, which combines the real-time and batch methods
  • Ingestion Style
    Data ingestion can follow two styles:
    • Push style: the source system sends notifications to the data ingestion process that there is “new” data available
    • Pull style: the ingestion platform regularly polls the source system to see if new data is available and then triggers the data ingestion pipeline
  • Data Ingestion Technical Approach
    We have deep expertise in all parts of the data ingestion solution, including:
    • Data Sourcing
    • Data Collection
    • Data Processing
    • Data Storage
    • Data Consumption
    • Data Visualization
  • Data Pipeline Architecture
    The large majority of data ingestion pipelines are not properly designed up front, resulting in hard-to-maintain software that is architecturally deficient. We provide expertise to ensure that the data ingestion solution addresses the following architectural quality attributes (AQAs):
    • Performance – provide the required throughput
    • Scalability – ability to handle additional data sources, increased data volumes, additional data processing logic, increased number of users
    • Reliability – the data ingestion produces accurate results and is not overly sensitive to patterns in the data that may change over time
    • Availability – the data ingestion is capable of handling some amount of failure and can resume processing
    • Manageability – it is easy to configure the data ingestion pipeline and to administer and monitor it at run time (e.g., start, stop, pause, quiesce, restart, performance reporting)
    • Modifiability and extensibility – ability to extend/modify the data ingestion pipeline (new data sources, new processing logic)
    • Security – adoption of various security controls to address different security threats
    • Usability – the data produced by the ingestion must easily/directly support a business process
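
The push and pull ingestion styles described above can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: the `Source`, `Pipeline`, and `PullIngestor` names are hypothetical, the source system is an in-memory list, and the checkpoint is held in memory rather than persisted.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Source:
    """Hypothetical in-memory source system with an append-only record log."""
    records: List[str] = field(default_factory=list)
    subscribers: List[Callable[[str], None]] = field(default_factory=list)

    def append(self, record: str) -> None:
        self.records.append(record)
        # Push style: the source notifies each subscriber that new data exists.
        for notify in self.subscribers:
            notify(record)

    def read_since(self, offset: int) -> List[str]:
        """Return every record at or after the given position in the log."""
        return self.records[offset:]


@dataclass
class Pipeline:
    """Stand-in for the ingestion pipeline; it only collects what it receives."""
    ingested: List[str] = field(default_factory=list)

    def ingest(self, record: str) -> None:
        self.ingested.append(record)


@dataclass
class PullIngestor:
    """Pull style: poll the source and remember how far it has been read."""
    source: Source
    pipeline: Pipeline
    offset: int = 0  # checkpoint (assumption: a real system would persist this)

    def poll_once(self) -> int:
        """Ingest any records added since the last poll; return how many."""
        new_records = self.source.read_since(self.offset)
        for record in new_records:
            self.pipeline.ingest(record)
        self.offset += len(new_records)
        return len(new_records)
```

In this sketch, a push-style consumer registers `pipeline.ingest` in `source.subscribers` and receives each record as it is appended, while a pull-style consumer calls `poll_once()` on a schedule. The stored offset is what would allow a pull-style pipeline to resume after a failure, which relates to the availability attribute listed above.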

ML and AI Augmented Data Management.

Many EDM functions have historically been performed using rules-based, hardcoded software logic. This logic is very brittle and requires constant change as the data changes, leading to a perpetual EDM software maintenance “treadmill.”

What is needed is a way to automatically derive EDM logic directly from the data itself using “augmented data management.” Augmented data management uses ML, RPA, and AI techniques to mature, optimize, improve, and automate multiple parts of the data management lifecycle, reducing overall software development and maintenance cost and effort.
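
The idea of deriving logic from the data rather than hardcoding it can be shown with a deliberately simple Python sketch. It uses basic frequency counting as a stand-in for the ML techniques mentioned above, and the function names `infer_schema` and `violations` are hypothetical: the expected type of each field is learned from sample records, so the check adapts when it is re-run on newer samples.

```python
from collections import Counter
from typing import Any, Dict, List


def infer_schema(samples: List[Dict[str, Any]]) -> Dict[str, str]:
    """Infer the most common Python type name for each field in the samples."""
    seen: Dict[str, Counter] = {}
    for record in samples:
        for name, value in record.items():
            seen.setdefault(name, Counter())[type(value).__name__] += 1
    return {name: counts.most_common(1)[0][0] for name, counts in seen.items()}


def violations(record: Dict[str, Any], schema: Dict[str, str]) -> List[str]:
    """List fields that are missing or whose type disagrees with the schema."""
    problems = []
    for name, expected in schema.items():
        if name not in record:
            problems.append(f"{name}: missing")
        elif type(record[name]).__name__ != expected:
            actual = type(record[name]).__name__
            problems.append(f"{name}: expected {expected}, got {actual}")
    return problems
```

A production approach would likely learn much richer profiles (value distributions, formats, relationships between fields), but the principle is the same: the rule set is an output of the data, not something a developer edits by hand.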

Axiologic Solutions has expertise in various EDM areas, many of which can benefit substantially from increased automation via ML/AI/RPA, including:

  • Schema understanding
  • Data transformation
  • Schema comparison and mapping
  • Data quality management
  • Data filtering
  • Data creation
  • Data integration
  • Data fusion
  • Data access
  • Data understanding

Data Management (DataOps, DataSecOps, MLDataOps, Data as Code) to Support ML at Enterprise Scale.

Because data is the fuel for ML model creation (e.g., training data), any latency in the availability of good-quality data delays the updating of models to better reflect operational realities. DataOps (also called MLDataOps or data-as-code) is a spin-off of the DevOps (or DevSecOps) practices used for business application software, and is concerned with optimizing the development, testing, deployment, execution, and monitoring of data processing workflows/pipelines (or, more precisely, feature pipelines) for ML model creation. DataOps has some overlap with:

  • Traditional software development as there is some software logic in the data processing pipeline
  • Infrastructure-as-code
  • DevOps/DevSecOps practices
  • Quality management, particularly statistical process controls
  • Data management practices
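
As one illustration of the statistical process control overlap, the Python sketch below checks whether a pipeline metric (for example, the number of rows in a daily load) falls within control limits computed from recent history. The function names and the three-sigma default are assumptions chosen for the example, not a prescribed method.

```python
from statistics import mean, stdev
from typing import List, Tuple


def control_limits(history: List[float], sigmas: float = 3.0) -> Tuple[float, float]:
    """Return (lower, upper) limits as the mean plus or minus N sample std devs."""
    center = mean(history)
    spread = stdev(history)  # requires at least two historical values
    return center - sigmas * spread, center + sigmas * spread


def in_control(value: float, history: List[float], sigmas: float = 3.0) -> bool:
    """True if the new observation lies within the control limits."""
    lower, upper = control_limits(history, sigmas)
    return lower <= value <= upper
```

A pipeline run whose metric falls outside the limits could be held for review before its output reaches downstream consumers.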

The focus of DataOps is how the output from the data pipeline will feed the ML model pipeline. The data pipeline is concerned with creating the feature datasets (training, validation, testing) that are used by the ML model pipeline. Data has a completely different lifecycle from ML models: data can be collected continuously during a day while ML models are refreshed only once per day; data can be used by multiple ML model pipelines; and an ML model pipeline may use data from multiple data pipelines. For these reasons, we explicitly decouple DataOps from MLDevSecOps. This decoupling is a critical element in achieving machine learning at enterprise scale.
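
One way to picture this decoupling is a versioned hand-off point between the two kinds of pipeline. The Python sketch below is a minimal illustration under stated assumptions: `FeatureStore` and `split` are hypothetical names, storage is in memory, and the split is a simple ordered cut rather than a shuffled or stratified one.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class FeatureStore:
    """Hypothetical hand-off point between data pipelines and model pipelines."""
    versions: Dict[str, List[List[dict]]] = field(default_factory=dict)

    def publish(self, dataset: str, rows: List[dict]) -> int:
        """Called by a data pipeline; returns the new 1-based version number."""
        self.versions.setdefault(dataset, []).append(list(rows))
        return len(self.versions[dataset])

    def latest(self, dataset: str) -> List[dict]:
        """Called by a model pipeline whenever it decides to refresh."""
        return self.versions[dataset][-1]


def split(
    rows: List[dict], train_pct: int = 70, validation_pct: int = 15
) -> Tuple[List[dict], List[dict], List[dict]]:
    """Cut rows into training, validation, and testing sets by percentage."""
    n_train = len(rows) * train_pct // 100
    n_validation = len(rows) * validation_pct // 100
    return (
        rows[:n_train],
        rows[n_train:n_train + n_validation],
        rows[n_train + n_validation:],
    )
```

Because the data pipeline only calls `publish` and the model pipeline only calls `latest`, each side can run on its own schedule, and several model pipelines can read the same dataset without coordinating with the pipeline that produces it.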

Axiologic Solutions has the expertise to assist the government’s adoption of ML at enterprise scale, focusing on the training data, using DataOps and MLDevSecOps.