Important
WE HAVE DIFFERENT MOTIVATIONS and in all my career, this audience by far was the least likely to work towards the solution. and the facts are far more difficult to acquire in this domain.
Tables are common in Data experts conversations, while some will object to the term as supporters of a NoSQL environment, this position will only harm the critical conversations which need to occur across technology and stakeholder groups.
If we perform data modeling the logical model and physical model will be a 1:1 or perfect match.
Most things are either tied to or altered by the physical table which makes the process most effective when controlled by this component.
Truth
While most modeling tools intend to control the two models, the ability to alter the physical table has become widely used.
Myth
Physical tables are a SQL requirement and not in the new technology used in Big Data.
Truth
The structure of data when acquired regardless of source and irrespective of the technology will require a column definition and sequence of the columns, rows to capture each line of detail.
- When I measure waste, I begin with the physical tables.
- When I measure threats I begin with the physical tables
How to use the physical tables?
- Each column represents a header or data entry field in a source application.
- Source database and source application are two completely different things
- A source application has a user group and an associated process or procedure being followed.
- Business rules are designed into the application to guide the users actions
- Action the users must follow
- A source database has few rules and no constraints nor even if the logical model was defined to have them.
- A physical table DOES tell you what is ACTUALLY happening with the data.
When I want to profile a data quality problem?
- Step 1 understand the problem from the stakeholders perspective
- Step 2 assess the stakeholders role in the data and the process step the stakeholder work within.
- Step 3 identify the stakeholder group with the role of creating, when the data can be enriched and how the data is enriched?
- Step 4 assess the variance between the process owner, the stakeholders who would re-use the information and how the information get's to the stakeholders with the quality failure.
- Step 5 profile the data starting with the application where the data is "created" a physical table should be found with a 1:1 match of the entry points.
- If not found, a record retention process may be broken.
- Inputs must be captured and written, then validated during the close (more often usually)
- Each validation results in a record, when financial it must be retained in it's original format.
- Changes must be enriched never replacing the original records.
- Many restatements are forced and changing the source information
Meeting the Sarbanes Oxley 404 assurance-be a hero for an executive
- When I measure cost savings from waste, I measure the columns with information from the ERP master records by the number of rows returned in the profiling or daily batch report.
- When I am performing a security architecture role as the person who qualifies the projects and guides the application threat assessment during the SDLC;
- A threat assessment must be done on any create or get process introduced
- The business intelligence process should not create, while it may present information by joining two data sets, the use of in memory or a physical table writes to a new table with both parts of the information.
- Any mis-match between the source application - in the master data records we must assume any values created, updated and archived MUST be done with the MANAGEMENT CAPABILITY user group.
- These management capability users are a team of people who must associate and document a series of steps in due diligence and retain the records for audit purposes.
- generally legal, corporate finance and an operations person performing the action with both legal and finance validating the action.
a symptom of shadow IT
Duplicate indices - high power and degraded performance
- a problem with duplicate physical tables
- the table may be renamed and data structured in the same way as the source physical table
- the table may be named slightly different yet structured in the same way
- a symptom of an offline data algorithm writing back to an operational application
- I've seen this most often in CRM or marketing application
Person Centered Design
In the case of duplicate indices we must be concerned as the quality
into an operational process may force re-work and has the highest
probability of impacting the customer and external stakeholders.
"why would I use this application, the information isn't used by anyone else"
"why bother, have you seen how many customer records I'd have to go through, it's faster to just enter my customer and information"
- Employees
may re-key rather than try to find things in an application when each
day the application is flushed with backend data algorithms.
- Especially when the application is controlled and uncontrolled obviously.
- Worst case I've seen, users entered the information 7 times between the acceptance of the customer or win of an opportunity to commission crediting.
What may be threatening your customer experience?
Common issues requiring maintenance
- a large amount of null values
- a large amount of data type changes which never get changed
- column decimal format error
- data format error
- a table with zero values
Worst Case Scenario
- By the time the order booked, 7 different contractors entered into
another application at different times for a different stakeholder
within the company.
- Only when I was allowed to sit in a
bull pen with the other roles within the company did I learn of the way
we speak, and frequency of the events we process will either prompt fight to get it in or flight push it back to get it away.
- "the purpose and the motivations for each audience are highly
dependent on access to the information and frequency of transactions
processed."
- Assignment of specialized resources who only deal with the larger slower transactions significantly improves the situation.
- The systems are not setup to support the larger slower deals.
- A high transaction process will be the types most people are comfortable with the volume makes them the stakeholder to please.
- everything else must work around the volume stakeholders
- The low transaction high dollar deals are immediately non-standard and avoidance is the favored response.
- Null values
- Very few applications designed to support the customer experience allow a user to enter a null in the process workflow (sanctioned)
- A backend alogorithm isn't subject to the rule
- trust is lost by all stakeholders internally as Sales must be doing something wrong.
- Sales - not all sales and many times only a subset of sales is doing this.
- A null on the country forces a touch by all stakeholders in every process
- A null on the state or zip code - same effect
- A decimal error forces manual touches on pricing or cost which are both illegal unless you hide them in adjustments.
- A date error is typically a back door to trick a rebate or commission or performance on a business unit during any given period.
- in many cases, the forced date error will prompt a static date to be applied indicating no change when the actions in fact are performing a change showing another field to allow the illegal behavior.
- Many large temp files are another indicator of the same bad behaviors.
FACT
When I worked at Sony Corporation of American, I held a role and I was the only customer operations person who supported the direct business models for the company in the Western Region.
- Western Region - Highest sales region with Northern western region generally higher than Southern due to the number of Global accounts headquartered in Silicon Valley.
- Biggest customers - No two customers are the same
- Most demanding - forming relationships gets a team beyond this high touch part of direct business.
- Customer negotiated terms-No two legal agreements are the same. YOU CANNOT AND MUST NOT try to force a standard contract on these customers.
- You will violate the terms of the master agreement or not be working in good faith
- Use a transformation to a service work order model
When I went to work for the smallest reseller who had just won the sole source contract for all of the Western region as a system integrator for Sony's Video Conference line, I supported the highest selling team with the hardest to please customers.
As time moved on, I was asked to manage all Video Conference deals to ensure the smooth internal process which the other women were not willing to adjust to support.
The same challenges were faced that we had while working as the assigned product manager for the roll out in the Western Region rather than from New Jersey, as me and the new Director for our region were an awesome team.
Later when I made my way through a number of smaller or startups, the rule remained true in each case, a large account is in all parts of the world. We know where these customers are located. A few people in the company have access and make decisions on these accounts.
If you segment your organization first by the large accounts, again Sub-Account 30 in this example.
The segment 30 is the slow lane, high dollars but slow to travel.
The best way to manage the 30 business model is in their own applications, feeding to ERP at the time of the opportunity being validated.
- Customer commits-push 14 data fields to all downstream forms in ERP as the header and constraint people must remain in to meet the customer expectations.
- On the use case where 7 entries were made-you just reduced the cost by 6 contractors and removed the need to re-key.
- We know re-key is waste-imagine the savings you have from holding all stakeholders in the process accountable to their roles based on the date of push.
- You have enabled the customers experience
- you reduced the inbound data collection
- only new values enriched by the process will update the original table in a different physical table associated with the opportunity.
- You enabled commission crediting
- You enables supply chain
- You enabled customer service
- You enabled manufacturing or strategic alliances
- You improved data quality
- You've met the SOX requirements
- You've met the ISO requirements
- You are agile and ready to grow in all circumstances.
- You are governing the organization and have excellent operational efficiency
- Now, big data isn't so big anymore is it?
If it is still big, let's dive into the following;
- a large number of temp tables
foundation for any inbound data acquired from another database, another technology during transformation or a workflow execution and the structure which defines the outbound information dissemination. The Physical database table represents the source for any transition from traditional SQL or ETL integration to a SOA processing. The physical table in XML format will enable the transition to SOA components.
A physical db table would enable the best source for any data quality project, to understand the number of places the data is being used across the organization. Basically, a sponsor of a data quality project CANNOT tell you where others have acquired or duplicated data which intends to be and needs to be governed.
If ERP creates a customer, supplier, employee, item, component, offer or Accounting code in the master tables, any physical table with the same attributes has the potential to break the key control outside the system of record.
As a source structure in ETL workflow processing; the "Integration" or "ETL" processing will program an algorithm (typically in SQL) in any application the algorithm MUST be considered an offline or back-end transformation outside the system of record. The Sarbane's Oxley Act of 2002 was crafted to PREVENT these practices as an audit record no longer retains the original format as required for record retention purposes.
The problem with the requirement, most people who own the process have no visibility nor would they be able to comprehend the FACT that what they are accountable for the organization can be acquired outside the source application. The operational applications such as a CRM, an ATS or Supply Chain system are the source and only place data can be created to populate the reference and master tables in an ERP system.
Competing Motivations-culture
We have in one direction claimed governance and controls around the customer facing processes in response to Sarbane's Oxley increasingly complex regulations. The unfortunate dilemma was in the understanding of the law and it's intent or we failed to understand the "spirit of the law". Many companies increased operational support which allowed an entirely new layer of contract workers at every point of entry into an ERP system. The person entering the information into the workstream must be aware of the threats of jail or criminal charges one may face in any transactions which fail to meet the GAAP accounting principles. The result of this new layer of workers, significantly reduces the quality of information being entered.
Effect of the culture - 3rd party applications
How many projects have you heard where a company bought a nice 3rd party application and after it went live the company was in a worse situation? Well, let's think about the issue.
Was the purchase of a new application bad and the vendor product useless?
You must first determine if you used the features as designed.
Did you customize any parts or avoid using any fields?
Understanding that a 3rd party solution is designed for the needs of the industry, not any single company.
If you elect to or allow your staff to "elect" what they use and where in the application these choices were made will be evident in the offline reporting or business intelligence (EDW is a process within the BI Stack). The Physical table will define the information collected or acquired and in Big Data we call this PROCESSING OR CURATION.
Did the vendor product get implemented in the appropriate manner?
Verify the vendor manual to the physical table columns represent the actions or decisions made in a source and rows the events details or transaction inputs to the decision.
Physical Tables and Integrity Checks
Can you use the new 3rd party application canned reports?
Select a basic report in the 3rd party application.
For example; daily transaction report or all transaction detail reports.
Step 1.
Run daily transaction detail report
If the report is void of data in any place you have not setup the application correctly.
Step 2.
Verify the report against the physical table.
Step 3.
How many physical tables are there with the information from your application?
3NF
How many times is the table re-used?
Ideally, a 3NF physical table or the contents upon acquisition of the information will populate many views.
A single view for in memory, the same view segmented by region (column) such customer country or location code from the six segments in ERP.
The least level of security will be far easier to execute in a model where the rule is as described then further segmented by 3 segments (sub-account in erp) 10, 20, or 30 enables resources for resale and distribution in account code 10, assign all resellers to country and account code 10. dimensions then multi-byte processing against the region only.
Does the report have any missing information? My first lesson in the failure between the chair and the keyboard, began when I printed a canned report out of the newly implemented accounting system.
It was the year 2000, we were integrating 4 acquisitions and transitioning from 5 accounting systems (4 as 1 was a complex spreadsheet managing 70% of the revenue).
My consultant rushed in a crystal reports expert who designed these custom reports to address the situation. The problem, we used so many user defined fields that our system was not being used to perform the work we bought it to do for us. Therefore nothing was going to work. I was still unaware but skeptical. I was ready to tie the consultants to a stake when the 3rd party project accounting system turned out to have bad code and I learned any corrections were proprietary and would cost me x more than the negotiated maintenance contract.
When a person outside the data architecture, data operations and MDM space inquire into anything related to BI, EDW or operational use of Enterprise Business Data the response will always be limited, never inclusive and without regard for business governance and quality expectations.
In fact, it is a widely know fact that data people will not recognize the process owners or the dilemma a process owner faces.
In fact, it is beginning to surface that data people will not recognize the requirements from a records management stakeholder.
In both cases, we can imagine the best place to begin in restoring the right visibility for both process and records management owners.
never given any indication of the way data is used by the "self-proclaimed" owners of all data.
Extracts using a workflow template which defines the physical table structure which host the data being transformed. The ETL workflow Extracts from and places into a work table where any algorithm will be performed as the workflow transforms before placing the work table into a target table.
Profiling in comparison from the source application, source database table and or the best indicator of unique offline applications.
A physical db table IS NOT limited to the process of modeling.
although most tools begin with a logical model, transforms into a physical table and intends to conform the two in delivery of any usage. The logical and physical tables when not matched may be identified on an exception report or a some refer to this as a Swiss Cheese report. When physical tables are replicated in a 3NF and databases require structured physical tables to perform any business intelligence, custom analytical and transactional applications for "data at rest" and during operational activities begins to collect from a specific table or in 3NF acquires from many tables based on the type of Business Intelligence or EDW event processing during the batch or a workflow between data stores. We refer to the processing as data "in flight".
re-use and a snapshot of the "actual" activity in any data warehouse operation. An ideal state ensures the consistency between the logical model and physical model as a way to govern the data as defined by the modeler. The modeler has been tasked with ensuring the request is defined well and performs a series of qualifying questions to ensure the information meets the needs and is understood by the requesting sponsor of a project.