eNewsletter 3
Volume X, Number 10, November, 2010




The holidays are a'comin!


It's really the start of the holiday season. I have actually come to believe that Halloween kicks the whole thing off, and I like that. There's something about little kids all in costumes that's fun for everyone. Speaking of fun – watch the growth of software appliances. A software appliance consists of development tools, applications, or a combination of both, bundled with a scaled-down operating system. This bundle can then be deployed to any standard server or to the cloud. Another fun growth area is unified communications: combining real-time communication services such as instant messaging (chat), IP telephony, video conferencing, and call control with non-real-time communication such as voicemail, e-mail, SMS (Short Message Service), and fax. We want computing and access anytime, from anywhere. And we're getting it.

Here's the schedule, or you can view the complete schedule on our Website:

CSTA Web sessions:
November 3, 4
December 15, 16
January 19, 20
March 2, 3

UITJ (Understanding IT Jobs) Web sessions:
December 16
March 3

TR Web sessions:
December 1
February 2

Keep in touch - I love hearing from you - and keep up with technology!


~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

TechKnowledge


NoSQL Databases

NoSQL databases are popping up all over. These are called next generation databases, or data stores, and are non-relational, distributed, open-source, and horizontally scalable. They're designed for the Web, and used by the Web companies – Google, Amazon, Facebook, Salesforce, etc. They're also used by the rest of us, especially in the cloud.

This really changes the database world. We're so used to relational, SQL databases and the common knowledge they represent. We know about tables, rows and columns, stored procedures and triggers, and especially SQL. Even though specifics vary from Oracle to DB2 to SQL Server, the basic knowledge is the same. Relational databases work beautifully for our operational systems such as payroll, inventory, HR, sales, customers, etc. These databases can be large (but they're not huge), they run internally in our data centers, and they grow steadily in a manner that's easy to predict and control. Perfect for the relational design.

Consider, however, Web databases. Facebook's Cassandra is my favorite example. Facebook has over 500 million members; this means 500 million records. People join all over the world, and this database is distributed across multiple data centers with multiple servers in multiple locations. Then, each record has a variety of information that varies greatly from member to member. How many photos have you uploaded? How much detail did you put in your profile? How many friends do you have? This information does not fit nicely into rows and columns. We need new models.

Column stores, or column families
These databases – or data stores – group columns in a family that is expected to be accessed together. Google's BigTable is the granddaddy here – most of the others are based on it. Cassandra falls into this category, as do Hadoop, HBase, and SimpleDB.
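
To make the idea of a column family concrete, here is a minimal Python sketch. It is not how BigTable or Cassandra is actually implemented; the member keys, the family names ("profile", "photos"), and the read_family helper are all made up for illustration. It only shows that columns expected to be read together are grouped under one family, and that each row can hold a different set of columns.

```python
# Hypothetical member data, laid out the way a column-family store groups it:
# row key -> column family -> columns. All names here are illustrative only.
members = {
    "member:1001": {
        "profile": {"name": "Ann", "city": "Orlando"},
        "photos": {"photo:1": "beach.jpg", "photo:2": "dog.jpg"},
    },
    "member:1002": {
        "profile": {"name": "Raj"},  # fewer columns than member:1001
        "photos": {},                # no photos uploaded yet
    },
}


def read_family(store, row_key, family):
    """Return every column in one family for one row (empty dict if absent)."""
    return dict(store.get(row_key, {}).get(family, {}))


# One lookup fetches all the columns that are expected to be used together.
ann_profile = read_family(members, "member:1001", "profile")
```

Reading the "profile" family pulls back the name and city together without touching the photos, which is the access pattern column families are designed around.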

Document stores
These are often called row-based databases. Documents are grouped together in rows and accessed by rows. Examples include MongoDB and RavenDB.

Key-Value databases
These databases use domains instead of tables, and the items within a domain do not have to have the same schema (unlike the rows in a table in a relational database). Keys are used to access the domains. BerkeleyDB and Azure fit in here.
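
Here is a small Python sketch of the domain idea, under the assumption that a domain behaves like a named collection of items looked up by key. The Domain class and the member attributes are hypothetical and do not reflect the API of BerkeleyDB, Azure, or any other product. The point is simply that two items in the same domain do not have to share a schema.

```python
class Domain:
    """A toy key-value domain: items are looked up by key and may each
    carry a different set of attributes (illustrative only)."""

    def __init__(self, name):
        self.name = name
        self._items = {}

    def put(self, key, **attributes):
        # Store whatever attributes the caller supplies; no schema is enforced.
        self._items[key] = dict(attributes)

    def get(self, key):
        # Return the item's attributes, or None if the key is unknown.
        return self._items.get(key)


members = Domain("members")
members.put("ann", city="Orlando", friends=212)
members.put("raj", photos=["beach.jpg"], employer="Acme")
```

In a relational table both members would need the same columns; here "ann" and "raj" simply carry whatever attributes they have.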

Other types of NoSQL databases are self-explanatory – there are databases that hold specific types of data, including graphs, grids, objects, and XML. In addition, there are multi-value databases.

Multi-Value databases (MVDB)
These databases have relational properties, but use files and records rather than tables. MVDBs are extremely flexible and fast. Current MVDBs include: U2 (Universe and UniData), and Reality.

While most of these databases are fairly new, multi-value databases have been around since the 1970s. In other words, there have always been alternatives to relational databases. Now, however, we have a specific need for an alternative because of the Web. The column stores, document stores, and key-value databases are growing in use. No longer can we assume that all database specialists have the same general knowledge. New types of databases = new knowledge. When we're talking about databases on the Web, we've got to make sure we know what we're talking about.


~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

TechCheck


1. What country hosted the first Robotic Olympics in June of this year?

2. What's RIM's (Research in Motion) latest entry in competition with Apple?

3. Which of the following does not belong?
a) Azure
b) Cassandra
c) Hadoop
d) Vertica

4. What's the #1 OS (operating system) for smartphones in 2010?

5. What two major companies are working together to build tools for creating applications for the cloud?


~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

How has 2010 gone?


I ran across Gartner's survey of CIO (Chief Information Officer) goals from January, 2010. It's really interesting to read at the end of the year. The technical goals were:
1. Virtualization
2. Cloud Computing
3. Web 2.0
4. Networking, Voice and Data Communications
5. Business Intelligence
6. Mobile Technologies
7. Data/Document Management and Storage
8. Service-Oriented Applications and Architecture
9. Security Technologies
10. IT Management
It looks pretty accurate to me. I only quarrel with one item – and that's with the expression, not the fact. #3, Web 2.0, covers social networking and blogging – all those sites where the content is supplied by the visitor, not by the site owner. Think eBay in addition to Facebook. My only complaint is the phrase "Web 2.0." I don't think I've heard or seen it used in months. It's one of those terms that was created to explain something new, and then just disappeared as the newness wore off.

Keep the list around so when Gartner does the 2011 survey you can compare. It will give us all a snapshot of how dynamic 2011 will be in the eyes of CIOs. I'll make sure I get that survey to you.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Short Vocabulary


MDM

MDM (Master Data Management) is becoming more and more important as companies start collecting more and more data. There's got to be a way to integrate data from different sources, to deal with different types of data, and to protect data. MDM is a growing answer to those problems.

MDM (Master Data Management) Data integration concept that focuses on managing reference data, which is also called master data. Master data describes core business entities such as customers, locations, products, etc. It includes both traditional structured information and unstructured content such as documents and images. No single business application can provide all the core information on, e.g., a single customer. Information on a customer's individual purchases is captured by the sales system, credit card information is found in the billing system, and complaints and returns are in the CRM (Customer Relationship Management) system. The master data concept concentrates on a single system to build and maintain data that can then be accessed by the operational systems. Both CDI (Customer Data Integration) and PIM (Product Information Management) systems are vertical (industry specific) MDM systems. MDM functionality is also often included as a module in ERP (Enterprise Resource Planning), ETL (Extract, Transform, Load), EAI (Enterprise Application Integration), and BI (Business Intelligence) systems.
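
The customer example above can be sketched in a few lines of Python. This is only an illustration of the idea, not how any MDM product works; the three source dictionaries, the customer ID "C100", and the build_master_record function are assumptions invented for the sketch.

```python
# Hypothetical extracts from three operational systems, keyed by customer ID.
sales = {"C100": {"purchases": ["laptop", "mouse"]}}
billing = {"C100": {"card_on_file": "ending 4242"}}
crm = {"C100": {"complaints": 1, "returns": 0}}


def build_master_record(customer_id, *sources):
    """Combine what each system knows about one customer into one record.

    If two systems supply the same field, the later source wins; a real
    MDM system would apply explicit rules to resolve such conflicts.
    """
    master = {"customer_id": customer_id}
    for source in sources:
        master.update(source.get(customer_id, {}))
    return master


master = build_master_record("C100", sales, billing, crm)
```

The resulting record holds the purchases, the card on file, and the complaint and return counts in one place, which no single source system could provide on its own.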

data assurance Data assurance states that data must have:
Consistency; all aspects of the data must be available (also called completeness)
Accuracy; information must be correct (also called correctness)
Currency; information must be timely (also called relevancy).
Some definitions add the characteristics of:
Validity; information must be important to the business
Uniqueness; no duplicates are contained in the data.
Data assurance is also called data quality, and data quality assurance.
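
Some of these characteristics can be checked mechanically. The Python sketch below shows one possible set of checks for completeness, currency, and uniqueness. The field names ("id", "name", "updated"), the one-year freshness limit, and the function names are all assumptions made for illustration. Accuracy is left out on purpose, because confirming that a value is correct requires a trusted source to compare against.

```python
from datetime import date

# Hypothetical list of fields every record is expected to carry.
REQUIRED_FIELDS = ("id", "name", "updated")


def is_complete(record):
    """Completeness: every required field is present and not blank."""
    return all(record.get(field) not in (None, "") for field in REQUIRED_FIELDS)


def is_current(record, today, max_age_days=365):
    """Currency: the record was updated within the allowed number of days."""
    return (today - record["updated"]).days <= max_age_days


def duplicate_ids(records):
    """Uniqueness: return the set of IDs that appear more than once."""
    seen, dupes = set(), set()
    for record in records:
        if record["id"] in seen:
            dupes.add(record["id"])
        seen.add(record["id"])
    return dupes


records = [
    {"id": 1, "name": "Ann Lee", "updated": date(2010, 10, 1)},
    {"id": 2, "name": "", "updated": date(2008, 1, 15)},
    {"id": 1, "name": "Ann Lee", "updated": date(2010, 10, 1)},
]
```

With these sample records, the second one fails both the completeness and currency checks when measured against November 1, 2010, and ID 1 is flagged as a duplicate.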

data governance Data management terminology. Refers to the overall management of the availability, usability, integrity, and security of the data employed in an enterprise. Usually includes a governing body, a set of procedures, and a plan to execute the procedures. Owners, or custodians, of data assets are defined, as are policies that specify who is accountable for various aspects of the data, including its accuracy, accessibility, consistency, completeness, and updating. These processes cover how the data is to be stored, archived, backed up, and protected from mishaps, theft, or attack. In addition, a set of controls and audit procedures to ensure ongoing compliance with government regulations is included.

data integration Data management technology and products that provide access to data from multiple, diverse sources, and automatically integrate like data from different sources across hundreds of data formats and applications, both within and outside of the enterprise. Key trends in data integration include: extreme data integration scalability, distributed architectures for data integration, virtual and federated data integration, and real-time data integration. Integration can be accomplished by creating a new database from many sources, or by transforming data from one source to the other when integrating two sources. Data hubs and ETL (Extract, Transform, Load) products also perform data integration functions.

data mining Analyzing data from large data sets to detect trends and associations. Querying data collections with no expectations of the results. Data mining also describes using data from legacy systems for current management decisions. Commonly used with data warehousing, but can be used with any large collection of data. Report mining, text mining, and Web mining are variations of data mining that work completely with unstructured data.

data migration The process of moving data from one storage device to another, or moving data from one system to another. When data is moved to a new system, this includes converting the data to a new format. Data is migrated to new DBMSs (DataBase Management Systems), new application systems including ERP (Enterprise Resource Planning) systems, and data warehouse systems. Often included in ETL (Extract, Transform, Load) and text mining software.

data scrubbing, data cleansing Data management function. Validating data for accuracy. Includes eliminating duplicates and inconsistencies. Often used with data warehousing and when migrating legacy systems to newer technologies. Estimates state that up to 70% of the cost of implementing a data warehouse is eliminating "dirty data."
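
As a rough illustration of eliminating duplicates and inconsistencies, here is a minimal Python sketch. The record layout (a name and an e-mail address), the choice of e-mail as the matching key, and the scrub function are assumptions for the example; real scrubbing tools use far more elaborate matching rules.

```python
def scrub(records):
    """Normalize names and e-mail addresses, then drop duplicate e-mails.

    The first record seen for each e-mail address is kept.
    """
    cleaned, seen = [], set()
    for record in records:
        # Remove inconsistencies: collapse extra spaces, standardize case.
        name = " ".join(record["name"].split()).title()
        email = record["email"].strip().lower()
        # Eliminate duplicates: skip any e-mail address already kept.
        if email in seen:
            continue
        seen.add(email)
        cleaned.append({"name": name, "email": email})
    return cleaned


dirty = [
    {"name": "  ann   lee ", "email": "Ann@Example.com "},
    {"name": "Ann Lee", "email": "ann@example.com"},
    {"name": "raj PATEL", "email": "raj@example.com"},
]
clean = scrub(dirty)
```

The first two records look different until they are normalized; once they are, they turn out to be the same person, so only one survives.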

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Answers to TechCheck


1. The world's first International Humanoid Robots Olympic Games kicked off on June 21 at the Harbin Institute of Technology in China. Nineteen teams from China, the United States, Japan, South Korea, and Germany brought their best robots to compete in this three-day event. To enter the competition, robots had to be less than 60 cm long and have a human shape, with a head, two arms, and two legs. The robots had to compete in multiple challenges (24 to be exact), ranging from boxing to weight-lifting, dancing, and sprinting. In addition, there were some unusual domestic events, like cleaning and medical care.

2. The PlayBook – competition for the iPad. Just announced in September, it's a tablet computer used for conferencing, Web surfing, email, gaming, and more. It's taking the iPad head-on.

3. d) Vertica does not belong. It is a columnar database (a relational database that processes data by columns instead of rows) rather than a NoSQL database, which the others are. A NoSQL database does not use SQL, and there are multiple kinds of architectures for these databases.

4. This is sort of a trick question. The answer is Symbian, with 41% of the market. iOS (Apple) had 14%, RIM (BlackBerry) had 18%, and Android (Google) had 17%. The trick part is that this is worldwide. In the US, BlackBerry is #1 for existing phones and Android is #1 in new purchases. Gartner predicts that Symbian and Android will be #1 and #2 by 2014.

5. It's Google and VMware who are building tools so developers can build, deploy, and manage applications within any cloud environment on any device. The collaborative projects that will be available in the next two weeks include Spring Roo and Google Web Toolkit, Spring Insight and Google Speed Tracer, and SpringSource Tool Suite and Google Plugin for Eclipse.


~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Privacy Policy


SemCo Enterprises, Inc. respects your privacy. We do not sell, rent or share your information with anyone.

   
Contents
The holidays are a'comin!
Teaser
TechKnowledge
TechCheck
Answers to TechCheck
Short MDM Vocabulary
How has 2010 gone?
   
SemCo's Newsletter

TechConnections is SemCo's free monthly newsletter that features important IT articles and a unique perspective on IT for the non-technical professional.


   
Teaser
What are smartphones most used for?


TechConnections Archived Editions

If you receive the Text version of this newsletter and you'd like to view it in HTML, join our Resources membership, then click on "Register Today."



If you have a technical question while reading TechConnections or if you would like to make a suggestion, send us a quick email - we'll respond, usually within 24 hours!

Contact us at:

SemCo Enterprises, Inc.
P. O. Box 195427
Winter Springs, FL 32719-5427
800.860.2179
semco@semcoenterprises.com
http://www.semcoenterprises.com

Copyright © 2010 SemCo Enterprises, Inc. All Rights Reserved (but feel free to quote it, think about it and forward to others.)
