It's hard to believe that the holidays are over. Especially if you're at my house where it still seems as if there are decorated trees both inside and out. Somehow that old idea of putting everything away on 1/1/whatever fell by the wayside this year. Soon, soon. Technology is already getting exciting. Hints of lots of hiring, which sounds good to all of us. Hints of new technologies (but not new smartphones. That one seems to have run its course for a while). Again the headlines talk about data, business intelligence, analytics, and the cloud. I'll keep you posted as I delve into the articles. Welcome back to the work world. Let's all have a great time in 2011. . .
Here's the schedule, or you can view the complete schedule on our website:
CSTA Web sessions:
January 19, 20
March 2, 3
UITJ (Understanding IT Jobs) Web sessions:
TR Web sessions:
Keep in touch - I love hearing from you - and keep up with technology!
We're starting the new year with a look at data. As data grows, it grows in importance! Companies now have so much data that they have to pay attention to how it's stored, where it's stored, and who stores it. Some of the new data collections (think Facebook, Google, etc.) don't fit the popular relational database design, so new designs had to be invented. As cloud storage grows, we have another reason to look for alternative designs. Large databases are inevitably distributed (stored over multiple systems), and relational databases perform poorly when distributed. Another way data has changed is its format. Corporate data has traditionally been textual, which is the easiest format to store and access. That's changing, however. Even corporate data is now expressed in graphics and is quickly moving to include audio and video formats.
Relational databases are dominant in the corporate world. These databases store data in two-dimensional tables with rows and columns. Each database has a defined schema: the tables, columns, and relationships that make up the relational database. A schema is often depicted visually in modeling software such as ERwin or drawing software such as Visio. The most common RDBMSs (Relational DataBase Management Systems) used in the business world are Oracle, DB2, SQL Server, MySQL, and Sybase. The databases created under these systems tend to be very complex, with hundreds of tables and intricate relationships among them. This is the nature of corporate data. We have customers, collections, sales, invoicing, products, vendors, billing, returns, complaints, locations, etc. Each of these areas contains information that depends on, or uses information from, the others. Data this complex should stay in relational databases. It's the new formats we have to keep up with.
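To make the idea of a schema concrete, here's a tiny sketch using Python's built-in sqlite3 module. The table and column names (customers, orders, etc.) are invented for illustration; a real corporate schema would have hundreds of tables like these, all tied together by keys.

```python
import sqlite3

# In-memory database; the schema below is a made-up two-table example
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (
        customer_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL
    );
    CREATE TABLE orders (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customers(customer_id),
        total       REAL
    );
""")
conn.execute("INSERT INTO customers VALUES (1, 'Acme Corp')")
conn.execute("INSERT INTO orders VALUES (100, 1, 250.0)")

# A join follows the relationship the schema defines between the tables
row = conn.execute("""
    SELECT c.name, o.total
    FROM customers c JOIN orders o ON c.customer_id = o.customer_id
""").fetchone()
print(row)  # ('Acme Corp', 250.0)
```

The foreign key (`orders.customer_id` pointing at `customers.customer_id`) is exactly the kind of relationship a tool like ERwin or Visio would draw as a line between two boxes.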
The first new format isn't really so new – it's still relational. But, and it's a big "but," columnar, or column-based, databases handle new volumes of data and are often used for data warehousing. The internal structure is based on the columns, not the rows. This means data is retrieved by columns, putting all the like data together. Column-based databases can be up to twenty times faster and require up to 90% less table storage space than traditional RDBMSs. They were designed for read-intensive workloads such as data warehouses. Hadoop and Vertica for the Cloud fit in this category.
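The row-versus-column distinction can be sketched in a few lines of plain Python. The sales records here are invented; the point is only how the two layouts group the same data differently.

```python
# Row-oriented: each record kept together (typical operational layout)
rows = [
    {"name": "Ann", "region": "East", "sales": 120},
    {"name": "Bob", "region": "West", "sales": 340},
    {"name": "Cai", "region": "East", "sales": 200},
]

# Column-oriented: each column kept together (typical warehouse layout)
columns = {
    "name":   ["Ann", "Bob", "Cai"],
    "region": ["East", "West", "East"],
    "sales":  [120, 340, 200],
}

# An analytical query ("total sales") reads one column and nothing else,
# which is why columnar storage shines for warehouse workloads
total = sum(columns["sales"])
print(total)  # 660
```

In the row layout, the same query would have to walk through every record and pluck out one field each time; in the column layout, all the sales figures are already sitting together.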
Some of the new data corporations keep comes in multiple formats (video, audio, graphics), but the connections between the data aren't as complex as they are in operational systems. Think Facebook*. The information collected on each member is fairly simple, but varies with each individual. Some people build albums of photos; others don't even provide a profile picture. Some people post videos with audio; others post nothing but text. Some people have thousands of friends; others have only a few. Facebook uses Cassandra (as does Twitter), which is a highly scalable second-generation distributed database. Cassandra* is based on Dynamo's (Amazon) fully distributed design and Bigtable's (Google) ColumnFamily-based data model. Cassandra ends up being a key/value database, which has domains instead of tables. Domains contain items, which are identified by keys and can have a dynamic set of attributes. Each item can have a unique schema and contains all the pertinent information about the item. A domain could contain both customer items and order items, and data is commonly duplicated between items. Data is accessed through APIs (Application Programming Interfaces), commonly following SOAP (Simple Object Access Protocol) or REST (Representational State Transfer) standards. These databases scale easily and dynamically and are good for document-oriented data and distributed scalability.
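A Python dictionary makes a good mental model of a key/value domain. This is a sketch, not Cassandra's actual API; the member names and attributes are invented to mirror the Facebook example, where every item can have its own set of attributes.

```python
# A "domain" mapping keys to items; each item carries its own schema
profiles = {}

# Two members of the same domain with completely different attribute sets
profiles["alice"] = {"name": "Alice", "photos": ["beach.jpg"], "friends": 2041}
profiles["bob"]   = {"name": "Bob"}  # no photos, no friend count

# Lookup is always by key, and code must tolerate missing attributes --
# there is no fixed table schema guaranteeing a column exists
item = profiles["bob"]
print(item.get("photos", []))  # []
```

Contrast this with the relational world, where a missing profile picture would still occupy a (null) column in every row of a rigid table.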
A virtual database (or federated database, IBM's term) has been around for a while and remains in fairly steady use. A virtual database allows users to access data in disparate databases, and perhaps even unstructured information stored in documents or email messages, from a single query. These databases don't require the data to be converted to a single format.
The last term to keep in mind is No-SQL database. This is a catch-all term for any database that does not use SQL statements for access. In other words, any non-relational database. This includes databases based on Amazon's Dynamo key-value store and Google's BigTable, in addition to document databases (usually in JSON format) and graph databases such as those found in social networks.
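Since document databases usually store JSON, here's a minimal sketch of what "document-oriented" means, using Python's standard json module. The documents and the scan-style query are illustrative only; a real document database would index these fields rather than scan them.

```python
import json

# Two documents in the same collection with different shapes -- legal
# in a document database, impossible in one relational table
doc_a = json.loads('{"user": "ann", "tags": ["cloud", "nosql"]}')
doc_b = json.loads('{"user": "bob", "followers": 12}')

collection = [doc_a, doc_b]

# "Query" by looking for documents that happen to have a given field
tagged = [d["user"] for d in collection if "tags" in d]
print(tagged)  # ['ann']
```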
Companies are building more and more high-volume databases for many different uses. Picture Google's database holding the indexes of all the web sites. Picture eBay, with the databases to hold the auctions and the product information. These "new" formats will become more and more common.
*Originally developed by Facebook, then turned over to Apache; now an open-source Apache project used by many companies, including Twitter.
1. What's Steeper?
2. Which of the following does not belong?
3. Which is more worrisome – DDoS or SQL Injection?
4. What's so special about IBM's latest mainframe?
5. What database design is growing in popularity for large databases such as data warehouses?
Even the most conservative forecasts for 2011 agree that large layoffs have tapered off and the overall trend seems to favor hiring. A survey of IT managers at 136 firms in the U.S. and Canada with revenues above $50 million found that 48% of managers planned to add staff next year, with 11% planning to reduce staff. Companies are beginning to take on new IT projects, extending staff hours, hiring contractors, and turning to outsourcing. David Foote, CEO of the IT workforce analyst firm Foote Partners, said he believes the demand for IT skills is strengthening, and he's seeing a lot of demand for specific skills by business units in areas such as predictive analytics, architecture, social media, and security. There are lots of glimmers, and certainly most predictions about 2011 are better than they were for 2010. As I run into surveys and predictions, I'll use Twitter and my blog to get the information to you, but overall it sure looks like a positive year ahead.
And that's really good news!!!
We started off the year looking at databases, so let's continue with the basics and look at processors! When you look at the fact that computers have basically doubled in effectiveness every 18 months since the 1950s, it's obvious that processors have changed a lot. Let's look at what's happening.
chip, integrated circuit A computer chip is an integrated circuit, and can have many functions including processing, memory, timing, etc. These chips consist of millions of transistors, which are switches, or gates, that work with the presence or absence of electricity. Chips can be either linear or digital. Linear, or analog, chips receive continuously variable data such as radio waves, while digital chips work with discrete data – binary, 1s and 0s. Transistors make up integrated circuits (or chips), and computers are made up of integrated circuits.
chipset Microprocessor architecture. A set of chips that handles a specific function such as the processor, or the complete set of chips (processor, memory, input/output) that make up the basic functions. The term is sometimes used to refer to the functionality of the motherboard.
CISC (Complex Instruction Set Computer) Type of computer processor, used in early computers, and still used in mainframe and some mid-range and desktop systems. Contrast to RISC (Reduced Instruction Set Computer) architecture used in many desktop systems.
dual-core chips Computer architecture. Computer chip with two processors. Putting more transistors into a single microprocessor increased heat, and by dividing work between two microprocessors, a chip can run faster than a single-microprocessor chip without generating excessive heat. Servers with dual-core chips were available in mid-2005, while personal computers using this technology were available by the end of 2005.
fault tolerant Terminology. The ability of a system to respond to an unexpected hardware or software failure. Computers that have redundant processors that automatically take over in the event of a failure are said to be fault tolerant. There are many levels of fault tolerance, the lowest being the ability to continue operation in the event of a power failure. Fault-tolerance can be achieved through:
Failover: a backup operation that automatically switches to a standby database, server or network if the primary system fails or is temporarily shut down for servicing. The backup system mimics the operations of the primary system. Absolutely necessary for systems that rely on constant accessibility.
Mirroring: Storage management technique used to provide fault tolerance (protecting against system failures) by writing to two duplicate disks simultaneously. This way, if one of the disks fails, the system can instantly switch to the other disk without any loss of data or service.
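The failover idea above can be sketched in a few lines of Python. The `Server` class and request strings are hypothetical stand-ins for a real primary/standby pair; the point is only the automatic switch when the primary fails.

```python
# Hypothetical service objects standing in for a primary and standby server
class Server:
    def __init__(self, name, up=True):
        self.name, self.up = name, up

    def handle(self, request):
        if not self.up:
            raise ConnectionError(f"{self.name} is down")
        return f"{self.name} served {request}"

def failover_handle(primary, standby, request):
    """Try the primary; automatically switch to the standby on failure."""
    try:
        return primary.handle(request)
    except ConnectionError:
        return standby.handle(request)

primary = Server("primary", up=False)  # simulate a primary failure
standby = Server("standby")            # mirror of the primary
print(failover_handle(primary, standby, "GET /balance"))
# standby served GET /balance
```

Real failover systems add the hard parts this sketch skips: detecting the failure quickly, keeping the standby's data in sync (that's where mirroring comes in), and failing back once the primary recovers.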
multi-core chips Computer architecture where a single piece of silicon holds two or more processing cores. Dual-core architecture, building two processing cores on a single chip, is common; quad-core (4 processors) is growing; octo-core (8) exists; and Interlagos processors with between 12 and 16 cores on a single chip are planned for release in 2011 by AMD.
parallel computer A computer that has multiple processors and can have each processor work on part of a problem. Considered to be the supercomputer of the future; parallel computers are faster than standard supercomputers. Used mostly in scientific and engineering applications, but quickly being implemented in the business world. Parallel computers usually have a few processors; massively parallel processors can have thousands.
RISC (Reduced Instruction Set Computer) Midrange computer. RISC machines are built with chip technology, but speed and storage capacity are equal to midrange computers so they are usually classified as midrange systems. These computers are designed with a specific and smaller instruction set to be more efficient, and are typified by fast speeds and extensive graphic capabilities. They are often called servers as they function in that capacity most of the time. The dominant operating system used with RISC systems is Unix. Originally developed by IBM, but now produced by many vendors. Alternative is CISC (Complex Instruction Set Computer).
server blades Computer architecture. Computer system built by stacking physically small parts, or blades, of a computer system in a rack to occupy less space. Each blade can include processors, memory, storage (disks) and network connections, and all share the common power and air-cooling resources of the rack (also called a chassis). The blades can be switched in and out without shutting down the system (called hot-swappable). Total systems are called blade, hyper dense, or ultra dense servers. Systems built with even more parts are called brick servers. These systems are typically used for applications such as Web hosting and scientific processing which requires multiple processors. Being produced by many vendors including IBM, Hewlett-Packard, Sun, Compaq, and several start-up companies. Released: 2001.
zIIP (z9 Integrated Information Processor) Computer architecture, specialty chip for the z9 mainframe that's designed to handle the workloads of BI (Business Intelligence), ERP (Enterprise Resource Planning), and CRM (Customer Relationship Management) applications. It is designed for data serving workloads. Released: May, 2005.
1. I really like this question! It's another step in green IT. Chip manufacturers and European research institutions have banded together to figure out how to redesign microprocessors so that they consume less energy when in use and leak less energy when in stand-by mode. The research project aims to eliminate processor power consumption almost entirely when chips are in stand-by mode, as well as cut power usage by a factor of ten when in use.
2. This time it's a) that doesn't belong. WiBree is a short-range wireless technology. It's an offshoot of Bluetooth and was developed by Nokia – then turned over to Bluetooth. It's a standard for low-power devices with button-cell batteries. WiBro is WiMax in Korea, and WiFi is WiMax's predecessor. So, all are wireless, but just one is in the Bluetooth family, with the other three following WiFi.
3. Both are problems. A Distributed Denial of Service (DDoS) attack bombards a Web site with messages, visitors, etc. so that regular business cannot be performed. This just happened to MasterCard, with DDoS attacks preventing real visitors from reaching its web site because of its refusal to support WikiLeaks. SQL injection is a hacking technique that attempts to pass SQL commands (statements) through a web application to be executed by the backend database. Hackers can use Web features such as login pages, support and product request forms, feedback forms, search pages, and shopping carts to insert SQL that lets them view information from the database and/or even wipe it out. We hear more about DDoS attacks right now, but a successful SQL injection can cause permanent damage.
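Here's a small demonstration of the SQL injection idea using Python's sqlite3 module. The table and the "login form" input are invented; the contrast between pasting user input into a statement and using a parameterized query is the real API behavior.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
conn.execute("INSERT INTO users VALUES ('ann', 's3cret')")

# Malicious "name" typed into a login form
evil = "x' OR '1'='1"

# UNSAFE: pasting form input straight into the SQL statement --
# the quotes in the input rewrite the query's logic
unsafe = conn.execute(
    f"SELECT secret FROM users WHERE name = '{evil}'").fetchall()
print(unsafe)  # [('s3cret',)] -- the injected OR '1'='1' matched every row

# SAFE: a parameterized query treats the input as plain data
safe = conn.execute(
    "SELECT secret FROM users WHERE name = ?", (evil,)).fetchall()
print(safe)    # [] -- no user is literally named "x' OR '1'='1"
```

The fix is the same in every language and database: never build SQL by string concatenation with user input; always use the driver's parameter placeholders.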
4. The latest system is the zEnterprise system, and what makes it special is that the operating system not only manages the mainframe computer, it can also manage applications running on servers – particularly blade servers. Hmm. Handling the integration of applications running on servers and mainframes. This is special.
5. Column-based, or columnar, databases. These are relational databases, but they access data by columns rather than rows. In traditional operational processing it makes sense to retrieve a row of data – get all the information you can on one person, or one item, etc. Retrieving data by columns works better for analytical queries, where it makes sense to scan a single column – all names, or all salaries – across large amounts of data.
SemCo Enterprises, Inc. respects your privacy. We do not sell, rent or share your information with anyone.