Posts

Showing posts from 2021

Azure Event Hubs - Namespaces, Hubs, Schema Registry, RBAC

Microsoft has added a Schema Registry to Azure Event Hubs. This is another feature-parity checkbox for those thinking of moving from Kafka to Event Hubs. The Schema Registry at this point feels like it was created by a different team, with a slightly different organizational structure and RBAC than that of the Hubs themselves. Schema Registries are useful in a lot of circumstances. Microsoft would have been better served by making the Schema Registry a stand-alone offering with its own Portal blade. The video below walks through how the Schema Registry is fitted into Event Hubs.

Video

Speaker Notes

Namespaces
Namespaces are the primary top-level organizational unit in Azure Event Hubs. Hub RBAC can be applied at the Namespace level. Schema Registry RBAC can be applied at the Namespace level.

Hubs
Hubs are the individual event streams; they are topics in Kafka terms. Access tokens are supported at the Hub level.
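As a rough illustration of how the registry hangs off the namespace, here is a minimal Python sketch using the azure-schemaregistry client. The namespace, schema group, and schema names are placeholders, and the caller is assumed to hold a Schema Registry RBAC role on the namespace.

```python
# Minimal sketch: register an Avro schema in a schema group that lives
# inside an Event Hubs namespace. Names below are placeholders.
from azure.identity import DefaultAzureCredential
from azure.schemaregistry import SchemaRegistryClient

credential = DefaultAzureCredential()  # relies on the namespace-level RBAC discussed above
client = SchemaRegistryClient(
    fully_qualified_namespace="my-namespace.servicebus.windows.net",
    credential=credential,
)

avro_definition = """{
    "type": "record",
    "name": "Reading",
    "fields": [{"name": "value", "type": "double"}]
}"""

# The schema group is created under the namespace in the Portal.
props = client.register_schema(
    group_name="my-schema-group",
    name="Reading",
    definition=avro_definition,
    format="Avro",
)
print(props.id)
```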

Recognizing "the good old days" while you are in them

I've worked for over a dozen different companies across 20+ efforts. Three team efforts stood out from the rest. Two of the projects had reunions or socials up to 20 years after the project peaked. One has a Facebook group. The other has an international mailing list. People go a lifetime without ever working with a group that has a reunion that draws people from hundreds of miles away. Most are shocked when I tell them I worked at two places where this happened. See the video below.

None of the projects were easy. All involved conflict. All involved more than 40 hours a week on a semi-regular basis. Some parts were pretty miserable. I wanted to quit and stare at the ocean.

These special projects were all personal and professional learning experiences. They make all other employment just work, now that I have forgotten the exhaustion and frustration. I made friends with people I would work with or recommend years later. Almost everyone involved increased their skills.

Single Node Kubernetes with Docker Desktop

You can run single-node Linux Kubernetes clusters with full Linux command-line support using Docker Desktop and Windows WSL2. Docker Desktop makes this simple. This is probably obvious to everyone; I wrote this to capture the commands to use as a future reference. There is another article that discusses creating a multi-node Kubernetes cluster on a single machine using Kind: http://joe.blog.freemansoft.com/2020/07/multi-node-kubernetes-with-kind-and.html

Windows 10 / 11 - WSL2 - Docker
The simplest way to run Kubernetes on Windows 10 is with the Docker Desktop and WSL2 integration. Docker will prompt to enable WSL2 integration during installation if you are running a late enough version of Windows 10 or any Windows 11.

Prerequisites
Docker is installed with WSL2 integration enabled. The docker command operates correctly when you type a docker command in a WSL Linux command prompt.

Install, Enable, and Verify
Install Docker. Enable Kubernetes: open the Settings
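Once Kubernetes is enabled, you can sanity-check the single-node cluster from a WSL2 prompt. A minimal sketch using the official kubernetes Python client, assuming your kubeconfig already points at the docker-desktop context:

```python
# Minimal sketch: verify the Docker Desktop single-node cluster is up.
# Assumes `pip install kubernetes` and a kubeconfig written by Docker Desktop.
from kubernetes import client, config

config.load_kube_config()  # picks up ~/.kube/config, docker-desktop context
v1 = client.CoreV1Api()

# A single-node cluster should list exactly one node.
for node in v1.list_node().items:
    print(node.metadata.name, node.status.node_info.kubelet_version)
```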

Associating Personas - identifying when two "people" are the same person

Identifying the "same person" when they exist in multiple affiliates or multiple contact channels can be messy, with a set of tradeoffs. People show up or interact with organizations with different personas. They may be customers, incident reporters, marketing contacts, or someone who just happens to make an inquiry. Even customers / registered people may exist as more than one person because of mergers, identity changes, personal choice, or system errors. The speaker notes below represent a subset of the comments in the video.

Video

Associating Personas - Images
Organizations make people create accounts in order to bind those people to permissions and preferences. Accounts may provide traceability from account to person, but they often don't provide the only link to that person. People can create multiple accounts for various purposes. This means that an account may be bound to a
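The account-to-person linking idea can be sketched with match keys. This hypothetical Python fragment groups account records by a normalized email; the field names, sample data, and the single-key matching rule are all assumptions for illustration, not anything from the talk:

```python
# Hypothetical sketch: group accounts that likely belong to one person
# by a normalized contact channel (email here). Field names are invented.
from collections import defaultdict

accounts = [
    {"account_id": "A-1", "email": "Pat.Smith@example.com"},
    {"account_id": "A-2", "email": "pat.smith@EXAMPLE.com"},
    {"account_id": "A-3", "email": "chris@example.com"},
]

def match_key(account: dict) -> str:
    """Normalize the channel used to associate personas."""
    return account["email"].strip().lower()

personas: dict[str, list[str]] = defaultdict(list)
for account in accounts:
    personas[match_key(account)].append(account["account_id"])

# Two of the three accounts collapse into one "person".
for key, ids in personas.items():
    print(key, ids)
```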

Customers, Leads and Prospects are different levels of info trust

Companies and organizations deal with people. Sometimes they are highly confident of the person's identity, or the fact that it is the same person they dealt with in the past. Sometimes they believe they have a reliable identity when it turns out they are actually confident in the account's identity. Other times they have information that would never meet a legal bar. Those types of identities are good enough for marketing, sales, or preferences but not good enough for legal documents or other use cases.

Video
The video goes into more detail than the speaker notes in the slides section.

Presentation Content
Organizations have all kinds of different contacts with individuals and other organizations. Our confidence in knowing those individuals ranges from anonymous to highly confident.
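One way to make the anonymous-to-highly-confident spectrum concrete is to model it as an ordered scale. This is a hypothetical sketch with invented level names, just to show how use cases could gate on confidence:

```python
# Hypothetical sketch: an ordered confidence scale for person identities.
# Level names are invented for illustration.
from enum import IntEnum

class IdentityTrust(IntEnum):
    ANONYMOUS = 0        # no identity claim at all
    PROSPECT = 1         # self-asserted marketing contact
    LEAD = 2             # corroborated by at least one channel
    ACCOUNT = 3          # authenticated account, not a verified person
    VERIFIED_PERSON = 4  # identity-proofed, good enough for legal documents

def allowed_for_legal_documents(trust: IdentityTrust) -> bool:
    return trust >= IdentityTrust.VERIFIED_PERSON

print(allowed_for_legal_documents(IdentityTrust.ACCOUNT))  # False
```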

Protecting systems for different size disasters

There is a lot of complexity in hardening IT systems to survive different types of disasters. Hardening strategies balance user experience, cost, complexity, and risk tolerance. Which strategies might work best? Talking in metaphors: a major disaster where a lot of other people are impacted, a storm with an essentially transient effect, or a self-induced fire where we ourselves accidentally burned a room or a whole building. This blog article exists to provide static images of the slides used in the presentation.

Video

Presentation Slides
What are you afraid of?

Organizing the Raw Zone - Data straight from the Producers

The right approach for laying down raw data in your data lake or your cloud warehouse depends on your goals. Are you trying to ensure the data lands exactly as sent for traceability? Are you planning on transforming the data to a consumer model to decouple producers and consumers? Do you have structured data, semi-structured data, documents, or binaries? Do you have PII exposure?

Video

Presentation Slides
This section exists to provide static copies of the material in the video. Additional content may be added over time. We're talking as if you have a data pipeline that moves data from the producers into locations that are friendly to data consumers. It could be a simple pipeline with just a couple of steps, or it could be something sophisticated that includes things like Data Vault modeling layers. There are two main things to think about: Who owns making the data consumable? Are you capable of supporting an ongoing promotion process that converts data from producer schemas to consumer schemas?
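If the goal is landing data exactly as sent, a common pattern is to key raw zone paths by producer, dataset, schema version, and load date. A sketch under those assumptions; the layout convention here is illustrative, not from the post:

```python
# Hypothetical sketch: build a raw zone object path that preserves
# producer, dataset, schema version, and arrival date for traceability.
from datetime import date, datetime, timezone

def raw_zone_path(producer: str, dataset: str, major_version: int,
                  load_date: date) -> str:
    return (
        f"raw/{producer}/{dataset}/v{major_version}/"
        f"{load_date:%Y/%m/%d}/"
    )

today = datetime.now(timezone.utc).date()
print(raw_zone_path("billing", "invoices", 2, today))
# -> raw/billing/invoices/v2/<year>/<month>/<day>/
```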

Identity Management - Internal, Customers and Partners

Companies often manage multiple identity pools: internal users, B2C customers, B2B customers, partner interactive, and partner M2M. Internal, customer, and partner identities often use completely different systems for identity management, authentication (AuthN), and authorization (AuthZ). Their automation and identity controls are different even when their security risk profiles are the same. The different user types have similar requirements, but we often implement them separately. User types are often implemented and managed differently even though they should have the same top-level compliance and security requirements.

Identity systems all need to provide some basic functions:
Identity persistence
Identity creation and deletion
Identity validation
API and integration points for systems and applications
Group and role manipulation
Group and role exposure
Self-management via API or console
Automation integration points
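Those basic functions can be read as a single interface contract that every pool should satisfy, whatever system backs it. A minimal Python sketch of such a contract, with invented method names, just to show the shape:

```python
# Hypothetical sketch: the basic functions above expressed as an
# abstract interface. Method names are invented for illustration.
from abc import ABC, abstractmethod

class IdentityProvider(ABC):
    @abstractmethod
    def create_identity(self, profile: dict) -> str:
        """Identity creation; returns an identity id (persistence implied)."""

    @abstractmethod
    def delete_identity(self, identity_id: str) -> None:
        """Identity deletion."""

    @abstractmethod
    def validate(self, identity_id: str, credential: str) -> bool:
        """Identity validation (AuthN)."""

    @abstractmethod
    def groups_for(self, identity_id: str) -> list[str]:
        """Group and role exposure (AuthZ inputs)."""

    @abstractmethod
    def add_to_group(self, identity_id: str, group: str) -> None:
        """Group and role manipulation, callable from automation."""
```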

Explorations: supporting schema drift in consumer schemas

We can merge incompatible schemas in a variety of ways to make a usable consumer-driven schema. A previous blog article described how we should treat and track breaking schema changes. We're going to look at a couple of ways of merging different producer dataset versions into a single consumer dataset.

A new conformed dataset with both versions
Example: We have a date field where the date changes from non-zoned to one that has a timezone. Or it changes from implicitly zoned to UTC. Or the date changes from one timezone to another timezone like UTC. The source system has its own schema. Initially, it sends the data tied to a timezone without any zone info. That producer model is then pushed into a conformed schema. For the purposes of this discussion, we will assume that it just got pushed without any conversion. Eventually, the source system decides to ship the data with timezone info or as a differe
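A minimal sketch of the conversion step when merging both versions into one conformed dataset: v1 rows carry naive timestamps that we assume were implicitly in one known zone, while v2 rows already carry zone info. The zone choice and field shapes are assumptions for illustration:

```python
# Hypothetical sketch: normalize v1 (naive, implicitly America/New_York here)
# and v2 (zone-aware) timestamps into a single UTC consumer column.
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

IMPLICIT_V1_ZONE = ZoneInfo("America/New_York")  # assumed producer zone

def to_conformed_utc(ts: datetime) -> datetime:
    if ts.tzinfo is None:                # v1 row: attach the implicit zone
        ts = ts.replace(tzinfo=IMPLICIT_V1_ZONE)
    return ts.astimezone(timezone.utc)   # v2 rows convert directly

v1_row = datetime(2021, 6, 1, 9, 30)                        # naive
v2_row = datetime(2021, 6, 1, 13, 30, tzinfo=timezone.utc)  # zoned
print(to_conformed_utc(v1_row), to_conformed_utc(v2_row))
```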

Schema drift - when historical and current should be different datasets

Data producers often create multiple versions of their data across time. Some of those changes are additive and easy to push from operational to analytical stores. Other changes are transformational breaking changes. We want to integrate these changes in a way that reduces the amount of magic required by our consumers. All these changes should be captured in data catalogs where consumers can discover our datasets, their versions, and the datasets they feed.

Managing incompatible producer schemas
We can take this approach when providing consumers a unified consumer-driven data model:
Version all data sets and schema changes. Minor versions represent backward-compatible changes like table, column, or property additions. Major version numbers represent breaking changes from the previous versions.
Data should be stored in separate raw zone data sets based on major version
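A small sketch of that routing rule, assuming semantic-style version strings; the names are illustrative:

```python
# Hypothetical sketch: route incoming data to a raw zone dataset keyed by
# major version so breaking changes land in separate datasets.
def dataset_for(producer: str, name: str, version: str) -> str:
    major, _minor = version.split(".", 1)
    return f"{producer}/{name}/v{major}"

print(dataset_for("orders", "invoice", "1.4"))  # orders/invoice/v1 (additive)
print(dataset_for("orders", "invoice", "2.0"))  # orders/invoice/v2 (breaking)
```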

Azure IoT and M5Stack with M5Flow Blocky Python - C2D

The M5Flow (blocky) graphical program builder makes it easy to drag and drop a program that receives C2D messages from Azure IoT Hub. I've been playing with the M5Stack Core2 devices and wanted to see how hard it would be to create a program that sends data from Azure to an IoT device without having to actually write any code. They support several development environments, including a graphical Python builder. The UIFlow IDE includes common cloud integration blocks for Azure and AWS. There weren't a lot of samples out there; I hope others can use this as a starting point. You can find a link to a video walkthrough down below.

Azure IoT Hub Cloud-to-Device
From the Microsoft guide: https://docs.microsoft.com/en-us/azure/iot-hub/iot-hub-devguide-c2d-guidance
IoT Hub provides three options for device apps to expose functionality to a back-end app: Direct methods for communications
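For the cloud side of a C2D test, a back-end app can push a message with the azure-iot-hub service SDK. A minimal sketch, assuming a service connection string and a registered device id; the payload is whatever your device-side blocks expect:

```python
# Minimal sketch: send a cloud-to-device (C2D) message from a back-end app.
# The connection string and device id below are placeholders.
from azure.iot.hub import IoTHubRegistryManager

SERVICE_CONNECTION_STRING = "HostName=...;SharedAccessKeyName=service;SharedAccessKey=..."
DEVICE_ID = "m5stack-core2"

registry_manager = IoTHubRegistryManager.from_connection_string(
    SERVICE_CONNECTION_STRING
)
registry_manager.send_c2d_message(DEVICE_ID, "hello from the cloud")
```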