Lesson 12

Component Object Model (Conclusion)

Let us examine COM as a software component technology. First, COM provides a binary standard via its method calling conventions and vtables.

For example, a COM client needs only an object's CLSID, its interface IDs, and its interface specifications. The client does not need the internal layout of a COM object, nor does it have to link to a library to access the object. From the C++ development perspective this means that, besides the object and interface IDs, a client needs only a C++ class specification for each interface. Recall that these specifications contain nothing but pure virtual functions. To load and access a specific COM object, a client directly or indirectly obtains the class factory associated with that object. Using the class factory, the client can create instances of the COM object. The client's only access into a COM object is through its interface pointers; it has no idea how the COM object is actually implemented. COM's ability to decouple client-side code from server-side implementation internals is an example of how COM supports binary-level integration of software components.
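The separation described above can be illustrated with a small portable sketch. It is not real COM: `ICalculator`, `CreateCalculator`, and `ClientAdd` are hypothetical names invented for this example, the interface is trimmed (a real COM interface derives from `IUnknown` and returns `HRESULT`), and `CreateCalculator` merely stands in for going through a class factory.

```cpp
#include <cassert>

// ---- Shared header: all the client ever sees ----
// Hypothetical, trimmed interface: nothing but pure virtual functions.
struct ICalculator {
    virtual int Add(int a, int b) = 0;
    virtual void Release() = 0;   // the client never calls delete directly
};

// Stand-in for obtaining an object through a class factory.
ICalculator* CreateCalculator();

// ---- Client code: compiled against the interface only ----
int ClientAdd(int a, int b) {
    ICalculator* calc = CreateCalculator();
    assert(calc != nullptr);
    int sum = calc->Add(a, b);    // dispatched through the object's vtable
    calc->Release();              // the object disposes of itself
    return sum;
}

// ---- Server code: could live in a separate binary ----
namespace {
class Calculator final : public ICalculator {
public:
    int Add(int a, int b) override { return a + b; }
    void Release() override { delete this; }
};
}  // namespace

ICalculator* CreateCalculator() { return new Calculator; }
```

Note that `ClientAdd` is written before `Calculator` is even declared: the client depends on the interface specification alone, which is the point of the binary standard.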

The role of the CLSID (Class ID)

In Microsoft COM (Component Object Model), the CLSID (Class ID) plays a crucial role in letting a client create and access a COM object without knowing the object's internal layout and without linking to a specific library.


In summary, the CLSID facilitates the client's interaction with a COM object by acting as a unique identifier that the COM runtime uses to locate the object's implementation, load it dynamically, and instantiate it. The client does not need to know the internal structure of the object or link to a library, because COM retrieves everything it needs dynamically through the CLSID and the registry.
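The locate, load, and instantiate sequence can be sketched in portable C++. This is a simulation under stated assumptions, not the COM runtime: two in-memory maps stand in for the registry and for DLL loading, CLSIDs are plain strings rather than 128-bit GUIDs, and every name (`IGreeter`, `CreateByClsid`, `InstallDemoServer`, `GreetVia`, the CLSID string, and the DLL path) is hypothetical.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>

// Hypothetical interface; real COM interfaces derive from IUnknown.
struct IGreeter {
    virtual std::string Greet() = 0;
    virtual void Release() = 0;
};

using Clsid = std::string;  // assumption: real CLSIDs are 128-bit GUIDs
using GetClassObjectFn = std::function<IGreeter*(const Clsid&)>;

// Stand-in for the InprocServer32 registry entries: CLSID -> server path.
std::map<Clsid, std::string> g_registry;
// Stand-in for loading a DLL: server path -> its exported entry point.
std::map<std::string, GetClassObjectFn> g_loadableServers;

// Roughly what the runtime does for the client: look up, load, instantiate.
IGreeter* CreateByClsid(const Clsid& clsid) {
    auto reg = g_registry.find(clsid);
    if (reg == g_registry.end()) return nullptr;          // class not registered
    auto srv = g_loadableServers.find(reg->second);
    if (srv == g_loadableServers.end()) return nullptr;   // server not found
    return srv->second(clsid);
}

// ---- A server whose internals the client never sees ----
namespace {
class EnglishGreeter final : public IGreeter {
public:
    std::string Greet() override { return "hello"; }
    void Release() override { delete this; }
};
}  // namespace

const Clsid kGreeterClsid = "{hypothetical-greeter-clsid}";

// Plays the part of registration plus placing the DLL on disk.
void InstallDemoServer() {
    const std::string path = "C:\\demo\\greeter.dll";     // hypothetical path
    g_registry[kGreeterClsid] = path;
    g_loadableServers[path] = [](const Clsid& id) -> IGreeter* {
        return id == kGreeterClsid ? new EnglishGreeter : nullptr;
    };
}

// Client side: knows only the CLSID and the interface.
std::string GreetVia(const Clsid& clsid) {
    IGreeter* greeter = CreateByClsid(clsid);
    if (greeter == nullptr) return "";
    std::string text = greeter->Greet();
    greeter->Release();
    return text;
}
```

After `InstallDemoServer()` has run, `GreetVia(kGreeterClsid)` yields the greeting, while an unregistered CLSID yields an empty string because the lookup fails before any server code is reached.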

Let us summarize what we have studied in this module:

First, we were introduced to COM classes, class objects, and class factories.
We studied IClassFactory, that is, how CreateInstance creates instances of a COM object and how LockServer locks or unlocks a server.

In-process servers
1. An in-process server exports the function DllRegisterServer to register its COM classes (i.e., types of COM objects). The tool RegSvr32.exe loads an in-process server and calls DllRegisterServer. For each COM class supported by the server, the following registry entries must be made:
```
HKEY_CLASSES_ROOT\CLSID\{COM Object 1 CLSID} = "optional string value"

HKEY_CLASSES_ROOT\CLSID\{COM Object 1 CLSID}\InprocServer32 = Full path to DLL
```
2. An optional ProgID may also be entered. COM objects must also specify a threading model or apartment type. We will defer discussion of COM threading and apartments to another course. For now, we will use the default threading model, which is single-threaded.
3. The tool RegEdit.exe can be used to examine the contents of the registry.
4. DllUnregisterServer can be invoked via RegSvr32 /u to remove a server's registry entries.
5. When a client calls CoGetClassObject, COM calls DllGetClassObject to get a specific class object/factory from the server. Normally, the server examines the CLSID parameter of DllGetClassObject and, if it supports the requested COM class, creates a class factory and returns its IClassFactory interface to COM. In turn, COM returns the IClassFactory pointer to the client.
6. COM periodically calls DllCanUnloadNow to ask the server whether any of its objects are in use. If no objects are active and the server is not locked, the call returns S_OK to tell COM the server can be unloaded from memory; otherwise it returns S_FALSE.
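The bookkeeping behind the class factory, DllGetClassObject, and DllCanUnloadNow can be sketched without any Windows headers. The following is a simplified simulation, not the real exports: `GetClassObjectSketch` and `CanUnloadNowSketch` are hypothetical counterparts of the two DLL entry points, `IWidget` and `IWidgetFactory` are invented interfaces (the real `IClassFactory::CreateInstance` also takes an outer unknown and an IID), the result codes are placeholders, and the sketch assumes that live objects and server locks are what keep the server loaded.

```cpp
#include <cassert>
#include <string>

// Placeholder result codes. In real COM these are HRESULTs: S_OK is 0,
// S_FALSE is 1, and failures such as CLASS_E_CLASSNOTAVAILABLE are negative.
const long kOk = 0;
const long kFalse = 1;
const long kClassNotAvailable = -1;

struct IWidget {                       // hypothetical object interface
    virtual int Id() = 0;
    virtual void Release() = 0;
};

struct IWidgetFactory {                // simplified counterpart of IClassFactory
    virtual long CreateInstance(IWidget** out) = 0;
    virtual long LockServer(bool lock) = 0;
    virtual void Release() = 0;
};

long g_liveObjects = 0;   // objects handed out and not yet released
long g_serverLocks = 0;   // outstanding LockServer(true) calls

namespace {
class Widget final : public IWidget {
    int id_;
public:
    explicit Widget(int id) : id_(id) { ++g_liveObjects; }
    ~Widget() { --g_liveObjects; }
    int Id() override { return id_; }
    void Release() override { delete this; }
};

class WidgetFactory final : public IWidgetFactory {
    int nextId_ = 1;
public:
    long CreateInstance(IWidget** out) override {
        assert(out != nullptr);
        *out = new Widget(nextId_++);
        return kOk;
    }
    long LockServer(bool lock) override {
        g_serverLocks += lock ? 1 : -1;
        return kOk;
    }
    void Release() override { delete this; }
};
}  // namespace

const std::string kWidgetClsid = "{hypothetical-widget-clsid}";

// Counterpart of DllGetClassObject: check the CLSID, hand back a factory.
long GetClassObjectSketch(const std::string& clsid, IWidgetFactory** out) {
    *out = nullptr;
    if (clsid != kWidgetClsid) return kClassNotAvailable;
    *out = new WidgetFactory;
    return kOk;
}

// Counterpart of DllCanUnloadNow: kOk means "safe to unload".
long CanUnloadNowSketch() {
    return (g_liveObjects == 0 && g_serverLocks == 0) ? kOk : kFalse;
}
```

In this sketch the unload check answers `kFalse` while a `Widget` is alive or while a lock is outstanding, and `kOk` (zero, like `S_OK`) once both counts drop back to zero.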

COM clients

This module showed us how a COM client uses CoGetClassObject and IClassFactory::CreateInstance or CoCreateInstance to create an instance of a COM object and get its first interface pointer into the object.
We discussed how a client makes calls into a COM interface. We looked at how to call QueryInterface in one interface to get an interface pointer to another interface within the COM object. We also briefly discussed the differences between how a server and a client view a COM interface.
Finally, we looked at the coding steps in a COM client.
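The client-side steps reviewed above (create the object, call through the first interface, use QueryInterface to reach a second interface, release every pointer) can be sketched as follows. This is a portable approximation: `IBase` is a trimmed stand-in for `IUnknown`, interface IDs are plain integers rather than GUIDs, `CreateBuffer` stands in for `CoCreateInstance`, and all other names are hypothetical. COM library initialization is omitted.

```cpp
#include <cassert>
#include <string>

using Iid = int;                 // placeholder; real IIDs are 128-bit GUIDs
const Iid IID_Base = 0;
const Iid IID_Reader = 1;
const Iid IID_Writer = 2;

struct IBase {                   // trimmed stand-in for IUnknown
    virtual bool QueryInterface(Iid iid, void** out) = 0;
    virtual unsigned long AddRef() = 0;
    virtual unsigned long Release() = 0;
};
struct IReader : IBase { virtual std::string Read() = 0; };
struct IWriter : IBase { virtual void Write(const std::string& s) = 0; };

namespace {
// Server-side object exposing two interfaces; invisible to the client.
class Buffer final : public IReader, public IWriter {
    unsigned long refs_ = 1;     // the creator holds the first reference
    std::string data_;
public:
    bool QueryInterface(Iid iid, void** out) override {
        if (iid == IID_Base || iid == IID_Reader) {
            *out = static_cast<IReader*>(this);
        } else if (iid == IID_Writer) {
            *out = static_cast<IWriter*>(this);
        } else {
            *out = nullptr;
            return false;
        }
        AddRef();                // every pointer handed out is counted
        return true;
    }
    unsigned long AddRef() override { return ++refs_; }
    unsigned long Release() override {
        unsigned long left = --refs_;
        if (left == 0) delete this;
        return left;
    }
    std::string Read() override { return data_; }
    void Write(const std::string& s) override { data_ = s; }
};
}  // namespace

// Stand-in for CoCreateInstance returning the first interface pointer.
IWriter* CreateBuffer() { return new Buffer; }

std::string ClientRoundTrip(const std::string& text) {
    IWriter* writer = CreateBuffer();          // 1. create, get first interface
    writer->Write(text);                       // 2. call through that interface
    void* raw = nullptr;                       // 3. ask for a second interface
    bool ok = writer->QueryInterface(IID_Reader, &raw);
    assert(ok && raw != nullptr);
    IReader* reader = static_cast<IReader*>(raw);
    std::string result = reader->Read();
    reader->Release();                         // 4. release every pointer obtained
    unsigned long left = writer->Release();
    assert(left == 0);                         // the object has destroyed itself
    return result;
}
```

The client never names `Buffer`; it moves between `IWriter` and `IReader` purely by asking the object, and the object's lifetime ends when the last reference is released.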

The knowledge we have gained from Modules 2 and 3 provides us with a solid foundation of core COM concepts and development techniques. We are now ready to move on to developing COM servers, objects, and clients with ATL, the Active Template Library.