on Sunday, 30 June 2013

Data Cleaning

"Data cleaning is one of the three biggest problems in data warehousing - Ralph Kimball"
"Data cleaning is the number one problem in data warehousing - DCI survey"

Data cleansing, data cleaning or data scrubbing is the process of detecting and correcting (or removing) corrupt or inaccurate records from a record set, table, or database. Used mainly in databases, the term refers to identifying incomplete, incorrect, inaccurate, or irrelevant parts of the data and then replacing, modifying, or deleting this dirty data.

After cleansing, a data set will be consistent with other similar data sets in the system. The inconsistencies detected or removed may have been originally caused by user entry errors, by corruption in transmission or storage, or by different data dictionary definitions of similar entities in different stores. Data cleansing differs from data validation in that validation almost invariably means data is rejected from the system at entry and is performed at entry time, rather than on batches of data.
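For illustration, here is a rough sketch of a few common cleansing steps using pandas; the table, column names, and rules below are made up, not taken from any particular system.

```python
import pandas as pd

# Made-up customer records with typical "dirty data" problems:
# inconsistent casing, a duplicate row, a missing name, and an impossible age.
records = pd.DataFrame({
    "name":  ["Alice", "alice", "Bob", "Carol", None],
    "email": ["a@x.com", "a@x.com", "b@x.com", "c@x.com", "d@x.com"],
    "age":   [34, 34, -5, None, 28],
})

# 1. Standardise formatting so equivalent values compare equal.
records["name"] = records["name"].str.strip().str.title()

# 2. Remove exact duplicates introduced by double entry.
records = records.drop_duplicates()

# 3. Blank out values that violate simple domain rules.
records.loc[(records["age"] < 0) | (records["age"] > 120), "age"] = float("nan")

# 4. Drop records that are too incomplete to repair.
records = records.dropna(subset=["name", "email"])

print(records)
```

This is the batch style of cleansing described above, as opposed to validation, which would have rejected the bad values at entry time.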


on Saturday, 29 June 2013
Apple’s iOS 7 is growing quickly with more tablets and iPhones running the beta software than were running iOS 6 at the same time last year, according to new data released by mobile web optimizing company Onswipe. The startup found that by July 1, 2013, 0.28 percent of all iPad visits to its mobile-optimized sites were from devices running iOS 7, and as of June 17, 0.77 percent of all iPhones making Onswipe visits were also on the new beta OS.

iOS7


on Friday, 28 June 2013

Overview


Data preprocessing is a data mining technique that involves transforming raw data into an understandable format. Real-world data is often incomplete, inconsistent, and/or lacking in certain behaviors or trends, and is likely to contain many errors. Data preprocessing is a proven method of resolving such issues. Data preprocessing prepares raw data for further processing.

Data preprocessing is used in database-driven applications such as customer relationship management and in rule-based applications (such as neural networks).

Why do we need Data Preprocessing?
  1. Data in the real world is dirty
  • incomplete: attribute values are missing, attributes that should exist are absent, or only aggregate data is available
  • noisy: the data contains errors or outliers
  • inconsistent: there are discrepancies in codes and values
  • redundant data
  2. No quality data, no quality mining results (garbage in, garbage out)
  • quality decisions must be based on quality data
  • a data warehouse needs to integrate data of consistent quality
  3. Data extraction, cleaning, and transformation make up a major part of building a data warehouse
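As a rough sketch of how the issues listed above are usually tackled, the example below imputes missing values, clips outliers, harmonises inconsistent codes, and scales an attribute with pandas; the column names and thresholds are only illustrative.

```python
import pandas as pd

# Made-up sensor readings showing the three problems above:
# a missing value (incomplete), an extreme outlier (noisy),
# and the same category coded two different ways (inconsistent).
df = pd.DataFrame({
    "temperature": [21.5, None, 22.1, 250.0, 20.8],
    "status":      ["OK", "ok", "FAIL", "OK", "fail"],
})

# Incomplete: fill missing numeric values with the column median.
df["temperature"] = df["temperature"].fillna(df["temperature"].median())

# Noisy: clip readings to a plausible physical range.
df["temperature"] = df["temperature"].clip(lower=-40, upper=60)

# Inconsistent: map variant codes onto one canonical form.
df["status"] = df["status"].str.upper()

# Transformation: scale the numeric attribute to [0, 1] for mining algorithms.
t = df["temperature"]
df["temperature_scaled"] = (t - t.min()) / (t.max() - t.min())

print(df)
```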

on Thursday, 27 June 2013
Generally, tasks in data mining are divided into two kinds:
  • Predictive
         uses some variables to predict unknown or future values of other variables
  • Descriptive
         finds patterns in the data that can be interpreted by the people who describe the data

In more detail, the tasks are:
  • Classification (predictive)
  • Grouping / Clustering (descriptive)
  • Association Rules (descriptive)
  • Sequential Pattern (descriptive)
  • Regression (predictive)
  • Deviation Detection (predictive)
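For instance, classification (a predictive task) learns from labelled examples and then predicts the label of unseen records. Here is a minimal sketch using scikit-learn's bundled iris data set; the choice of model and split is arbitrary, just to show the idea.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

# Load a small labelled data set (flower measurements -> species).
X, y = load_iris(return_X_y=True)

# Hold out part of the data to check how well the learned model generalises.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

# Predictive task: learn a classifier from the training records...
model = DecisionTreeClassifier(max_depth=3, random_state=0)
model.fit(X_train, y_train)

# ...and use it to predict the class of previously unseen records.
predictions = model.predict(X_test)
print("accuracy:", accuracy_score(y_test, predictions))
```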

on Wednesday, 26 June 2013

Overview



The Open Group Architecture Framework (TOGAF) is a framework and method for enterprise architecture that provides an approach for analysing the overall business architecture. There are four architecture domains that are commonly accepted as subsets of the overall enterprise architecture, and all four are supported by TOGAF, namely:

1. Business Architecture: this architecture defines the business strategy, governance, organization, and key business processes.

2. Data Architecture: this architecture describes the structure of the organization's data assets.

3. Application Architecture: this architecture provides a blueprint for the application systems to be deployed, their interactions, and their relationships to the core business processes of the organization.

4. Technology Architecture: this architecture describes the hardware and software infrastructure required to support the business, data, and application architectures.


Distributed Problem-Based Learning


RULES

  • Learning begins with the presentation of an existing problem. The broad outline of the problem situation and its attributes is explained, along with the learning process and the learning tasks.
  • The learners state their initial perceptions of the problem.
  • The learners then analyse the problem and their initial perceptions of it.
  • After analysing the problem and their initial perceptions, the learners can refine those perceptions with new ones or with issues related to the problem.
  • In the final phase, the learners put forward critical ideas based on the arguments or perceptions they have gathered.
Impacts

  • Learners learn to present arguments and ideas well
  • As many arguments are collected, learners broaden their insight and gain new knowledge
  • Learners get to know strategies for drawing a solution out of a problem
  • Solutions to the problem can be more accurate, because the many arguments presented add value to the solution and reduce errors
MEDIA
I think the most suitable medium is a forum (group discussion): here the problem can be described, and every learner can deliver arguments and solution ideas by posting or commenting.


on Tuesday, 25 June 2013
Technology is highly developed in human life today. Everyone needs technology that is continually updated to meet the needs of daily life. One technology that is expanding rapidly is computer technology, and one important component of computing is the network. With a computer network, computers can communicate with each other and access information. Because of this, network maintenance is essential to keep communication between computers smooth and reliable, so people need sound network management concepts and the right ones to resolve issues.
Network
Network Mapping

on Monday, 24 June 2013
Data mining (the analysis step of the knowledge discovery in databases process, or KDD), an interdisciplinary subfield of computer science, is the computational process of discovering patterns in large data sets using methods at the intersection of artificial intelligence, machine learning, statistics, and database systems. The overall goal of the data mining process is to extract information from a data set and transform it into an understandable structure for further use. Aside from the raw analysis step, it involves database and data management aspects, data pre-processing, model and inference considerations, interestingness metrics, complexity considerations, post-processing of discovered structures, visualization, and online updating.

The term is a buzzword and is frequently misused to mean any form of large-scale data or information processing (collection, extraction, warehousing, analysis, and statistics), but it is also generalized to any kind of computer decision support system, including artificial intelligence, machine learning, and business intelligence. In the proper use of the word, the key term is discovery, commonly defined as detecting something new. Even the popular book Data Mining: Practical Machine Learning Tools and Techniques with Java (which covers mostly machine learning material) was originally to be named just Practical Machine Learning, and the term data mining was only added for marketing reasons. Often the more general terms (large-scale) data analysis, or analytics – or, when referring to actual methods, artificial intelligence and machine learning – are more appropriate.
on Sunday, 23 June 2013
Technology, as we know it, accompanies people through the methods and products that result from applying science to improve the quality of human life, and it keeps changing. Technology changes along with human needs. The development of this technology is also influenced by the need to judge whether the quality of multimedia information is good or bad. Multimedia Services Quality Assessment is one method that offers a solution to this problem: it is used to measure the quality of multimedia as received by the user and as transmitted by operators, so that users get multimedia of good quality.
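The post does not name a specific metric, but as one illustration of objective quality assessment, the sketch below computes PSNR (peak signal-to-noise ratio) between an original and a received image; higher values mean less distortion was introduced in transmission. The frames here are randomly generated just for the example.

```python
import numpy as np

def psnr(original: np.ndarray, received: np.ndarray, max_value: float = 255.0) -> float:
    """Peak signal-to-noise ratio between two images of the same shape, in dB."""
    mse = np.mean((original.astype(np.float64) - received.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")  # identical images: no distortion at all
    return 10.0 * np.log10((max_value ** 2) / mse)

# Made-up 8-bit grayscale frames: the "received" one has mild noise added.
rng = np.random.default_rng(0)
original = rng.integers(0, 256, size=(64, 64), dtype=np.uint8)
noise = rng.integers(-5, 6, size=(64, 64))
received = np.clip(original.astype(np.int16) + noise, 0, 255).astype(np.uint8)

print(f"PSNR: {psnr(original, received):.2f} dB")
```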


on Saturday, 22 June 2013
Computer Vision is a field of science that studies how computers can recognise the object being observed. The main purpose of this science is to enable a computer to mimic the eye's ability to analyse images or objects. Computer Vision is a combination of Image Processing (related to the transformation of the image) and Pattern Recognition (related to the identification of objects in the image).
How can a computer read an object? The cues a computer works with include colour (how many variations a point can take), the boundaries and colour differences between certain regions, consistent points (corners), area or shape information, and the intensity of physical movement or displacement. The supporting functions in a Computer Vision system are image capture (Image Acquisition), image processing (Image Processing), analysis of image data (Image Analysis), and understanding of image data (Image Understanding).
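As a minimal sketch of the acquisition, processing, and analysis steps mentioned above, the example below uses OpenCV (one possible library, not something the post prescribes) with a made-up file path to load an image, smooth it, and then extract edges and corners.

```python
import cv2
import numpy as np

# Image Acquisition: load an image from disk (the path is just an example).
image = cv2.imread("sample.jpg", cv2.IMREAD_GRAYSCALE)
if image is None:
    raise FileNotFoundError("sample.jpg not found")

# Image Processing: reduce noise so later steps see cleaner structure.
blurred = cv2.GaussianBlur(image, (5, 5), 0)

# Image Analysis: extract boundaries (edges) and consistent points (corners).
edges = cv2.Canny(blurred, threshold1=50, threshold2=150)
corners = cv2.goodFeaturesToTrack(blurred, maxCorners=50, qualityLevel=0.01, minDistance=10)

print("edge pixels:", int(np.count_nonzero(edges)))
print("corners found:", 0 if corners is None else len(corners))
```

The remaining stage, Image Understanding, would interpret these features, for example by matching them against known object models.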
on Friday, 21 June 2013
Technology is highly developed in human life today. Everyone needs technology that is continually updated to meet the needs of daily life. One technology that is evolving rapidly is computer technology; its rapid growth, especially in information processing, was inevitable. Computers are developed by humans to help finish their work, and this development is driven by a human passion and desire that is never satisfied. This rapid evolution will give birth to the latest computing technology, called Quantum Computing.

on Thursday, 20 June 2013

Overview

The Zachman Framework is an Enterprise Architecture framework which provides a formal and highly structured way of viewing and defining an enterprise. It consists of a two-dimensional classification matrix based on the intersection of six communication questions (What, Where, When, Why, Who and How) with six levels of reification, successively transforming the abstract ideas on the Scope level into concrete instantiations of those ideas at the Operations level.

The Zachman Framework is a schema for organizing architectural artefacts (in other words, design documents, specifications, and models) that takes into account both whom the artefact targets (for example, business owner and builder) and what particular issue (for example, data and functionality) is being addressed. The Zachman Framework is not a methodology, in that it does not imply any specific method or process for collecting, managing, or using the information that it describes. The Framework is named after its creator John Zachman, who first developed the concept in the 1980s at IBM. It has been updated several times since.
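One rough way to picture the classification matrix described above is as a grid of cells indexed by communication question and reification level. In the sketch below the intermediate level names are placeholders, since the post only names the Scope and Operations levels, and the sample artefact is invented.

```python
from itertools import product

# The six communication questions (columns) named in the post.
questions = ["What", "How", "Where", "Who", "When", "Why"]

# Six levels of reification (rows); only the endpoints are named in the post,
# so the intermediate labels here are placeholders.
levels = ["Scope", "Level 2", "Level 3", "Level 4", "Level 5", "Operations"]

# Each cell of the 6 x 6 matrix holds the architectural artefacts
# (design documents, specifications, models) for that level/question pair.
framework = {(level, question): [] for level, question in product(levels, questions)}

# Invented example: record an artefact in the Scope row, "What" column.
framework[("Scope", "What")].append("List of things important to the business")

print(len(framework), "cells")          # 36
print(framework[("Scope", "What")])
```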

on Wednesday, 19 June 2013

Technology, as we know it, accompanies people through the methods and products that result from applying science to improve the quality of human life, and it keeps changing. Technology changes along with human needs, including the needs of people in business processes. With the emergence of a variety of business challenges such as globalisation, financial conditions, and political pressures, businesses need to combine their need for information with strategies and a technology-based approach.

Customer Master Data Management is a solution for current business needs, which demand efficiency and where customer value is sometimes difficult to understand. In addition, master data aims to achieve a single view of the customer, a single view of the product, an account master record, and a single view of location. How is the strategy implemented? To achieve the best strategy, we have to look at the guidelines and references for building the business framework, at the application of the best ecosystems, and at how the implementation scenarios can be created. In a Master Data Management (MDM) system there is a master identification number (ID), and every derivative record related to MDM exists in each surrounding system and keeps the MDM ID.
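To illustrate that last point (the names, IDs, and systems below are invented), here is a sketch of a master customer record whose MDM ID is carried by the derivative records in each source system, which is what makes a single view of the customer possible.

```python
from dataclasses import dataclass, field

@dataclass
class MasterCustomer:
    mdm_id: str                      # the master identifier owned by the MDM system
    name: str
    # mapping: source system name -> that system's local record ID
    source_records: dict[str, str] = field(default_factory=dict)

# Invented master record assembled from several operational systems.
customer = MasterCustomer(mdm_id="MDM-000123", name="Acme Corp.")
customer.source_records["CRM"] = "CRM-9871"      # derivative record in the CRM
customer.source_records["Billing"] = "BIL-4455"  # derivative record in billing
customer.source_records["Support"] = "SUP-0072"  # derivative record in support

# Each source system keeps the MDM ID alongside its own local ID,
# so a single view of the customer can be reconstructed from any of them.
for system, local_id in customer.source_records.items():
    print(f"{system}: local id {local_id} -> master id {customer.mdm_id}")
```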
on Tuesday, 18 June 2013
Before discussing how computation was done before the advent of computers: computation itself is, by definition, a way of solving problems using mathematical concepts (to compute). At present, most computation involves computers; numerical and analytical problems alike are solved with them. But how could computation be done before the computer existed? In this post I will discuss how humans performed computation, and its history, before computers were invented.

In ancient times, before there were computers, people used only simple tools to solve problems. The tools used by humans were usually taken from nature, for instance stones, grains, or wood. The very first traditional and mechanical calculator was the abacus, which appeared about 5000 years ago in Asia Minor. This tool can be thought of as an early computing machine. It lets users perform calculations by sliding beads arranged on rods. Traders later used the abacus to calculate trade transactions. However, with the advent of pencil and paper, particularly in Europe, the abacus lost its popularity.
If in the previous post I explained how computing can be done without a computer, in this post I will tell you how the computer, a new computing machine, came into this world. I will first explain how computers came to exist and then how they have developed up to today. In this post I will explain the five generations of computers in the world.

First Generation

The first generation of computers was created during the Second World War. At first, computers were used to assist in the military field. In 1941, Konrad Zuse, a German engineer, built a computer, the Z3, to design airplanes and missiles. Another computer developed in this generation was the ENIAC. ENIAC was a general-purpose computer that worked 1000 times faster than previous computers. Then came computers with the Von Neumann architecture, a model with a memory that holds both programs and data. This technique allows the computer to stop at some point and then resume its job. That computer was the UNIVAC I (Universal Automatic Computer I). In this generation, a computer could only do a certain task, and its physical size was very large because it used very large components.