
Description

Project 11: Determining the Intrinsic Dimensionality of a Distribution. Okke Formsma, Nicolas Roussis and Per Løwenborg. Outline. About the project What is intrinsic dimensionality? How can we assess the ID? PCA Neural Network Nearest Neighbour Experimental Results.

Transcript

Project 11: Determining the Intrinsic Dimensionality of a Distribution
Okke Formsma, Nicolas Roussis and Per Løwenborg

Outline
- About the project
- What is intrinsic dimensionality?
- How can we assess the ID?
- PCA
- Neural Network
- Nearest Neighbour
- Experimental Results

Why did we choose this project?
- We wanted to learn more about developing and experimenting with algorithms for analyzing high-dimensional data.
- We wanted to see how we can implement this in a program.

Papers
- N. Kambhatla, T. Leen, “Dimension Reduction by Local Principal Component Analysis”
- J. Bruske and G. Sommer, “Intrinsic Dimensionality Estimation with Optimally Topology Preserving Maps”
- P. Verveer, R. Duin, “An evaluation of intrinsic dimensionality estimators”

How does dimensionality reduction influence our lives?
- Compressing images, audio and video
- Reducing noise
- Editing
- Reconstruction
[Figure: an image going through different steps in a reconstruction]

Intrinsic Dimensionality
The number of ‘free’ parameters needed to generate a pattern.
Examples:
f(x) = -x² => 1-dimensional
f(x,y) = -x² => 1-dimensional

Principal Component Analysis (PCA)
The classic technique for linear dimension reduction. It is a vector space transformation that reduces multidimensional data sets to lower dimensions for analysis. It is a way of identifying patterns in data and of expressing the data so as to highlight their similarities and differences.

Advantages of PCA
Patterns can be hard to find in high-dimensional data, where the luxury of graphical representation is not available, so PCA is a powerful tool for analysing data. Once you have found these patterns, you can compress the data by reducing the number of dimensions without much loss of information.

Example
[Figure]

Problems with PCA
Data may be uncorrelated and still have structure, for example a nonlinear dependence between variables. Because PCA relies only on second-order statistics (correlation), it sometimes fails to find the most compact description of the data.
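To make the PCA discussion concrete, here is a minimal sketch in Python with NumPy (not code from the project). It runs PCA on two synthetic data sets that both have intrinsic dimensionality 1: noisy points on a straight line in 3D, and points on the curve f(x) = -x². The function names `pca` and `reconstruct`, the sample sizes and the noise level are assumptions chosen for illustration.

```python
import numpy as np

def pca(data):
    """Return (mean, eigenvalues, eigenvectors), eigenvalues sorted descending."""
    mean = data.mean(axis=0)
    centered = data - mean
    cov = centered.T @ centered / (len(data) - 1)
    eigvals, eigvecs = np.linalg.eigh(cov)      # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]
    return mean, eigvals[order], eigvecs[:, order]

def reconstruct(data, mean, eigvecs, k):
    """Project onto the first k eigenvectors and map back to the input space."""
    basis = eigvecs[:, :k]
    codes = (data - mean) @ basis               # k-dimensional representation
    return mean + codes @ basis.T

rng = np.random.default_rng(0)                  # illustrative seed

# Case 1: a noisy straight line in 3D (linear, ID = 1).
t = rng.uniform(-1.0, 1.0, size=200)
direction = np.array([1.0, 2.0, 2.0]) / 3.0     # unit vector
line_data = np.outer(t, direction) + 0.01 * rng.normal(size=(200, 3))
line_mean, line_vals, line_vecs = pca(line_data)
line_recon = reconstruct(line_data, line_mean, line_vecs, k=1)
line_mse = np.mean(np.sum((line_data - line_recon) ** 2, axis=1))

# Case 2: the curve (x, -x^2) in 2D (nonlinear, still ID = 1).
x = rng.uniform(-1.0, 1.0, size=200)
curve_data = np.column_stack([x, -x ** 2])
_, curve_vals, _ = pca(curve_data)

print("line eigenvalues: ", line_vals)
print("curve eigenvalues:", curve_vals)
```

On the line, one eigenvalue dominates and a single component reconstructs the data almost perfectly. On the curve, x and -x² are uncorrelated, so global PCA keeps two clearly non-zero eigenvalues even though one parameter generates the data.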
Problems with PCA
[Figure: first eigenvector and second eigenvector]

A better solution?
[Figure: local eigenvectors]

Another problem
Is this the principal eigenvector? Or do we need more than one?

Choose
The answer depends on your application.
[Figure: low resolution and high resolution]

Challenges
- How to partition the space?
- How many partitions should we use?
- How many dimensions should we retain?

How to partition the space?
Vector Quantization

Lloyd Algorithm
Partition the space into k sets.
Repeat until convergence:
- Calculate the centroid of each set.
- Associate each point with the nearest centroid.

Lloyd Algorithm, illustrated with two sets (Set 1 and Set 2)
Step 1: Randomly assign points to sets.
Step 2: Calculate centroids.
Step 3: Associate points with the nearest centroid.
Step 2: Calculate centroids.
Step 3: Associate points with the nearest centroid.
Result after 2 iterations: [Figure]

How many partitions should we use?
Bruske & Sommer: “just try them all”
For k = 1 to k ≤ dimension(set):
    Subdivide the space into k regions
    Perform PCA on each region
    Retain significant eigenvalues per region

Which eigenvalues are significant?
Depends on:
- Intrinsic dimensionality
- Curvature of the surface
- Noise

Discussed in class:
- Largest-n
In papers:
- Cutoff after normalization (Bruske & Sommer)
- Statistical method (Verveer & Duin)

Cutoff after normalization
µx is the xth eigenvalue, with α = 5, 10 or 20.

Statistical method (Verveer & Duin)
- Calculate the error rate on the reconstructed data if the lowest eigenvalue is dropped.
- Decide whether this error rate is significant.

Results
- One-dimensional space, embedded in 256*256 = 65,536 dimensions
- 180 images of a rotating cylinder
- ID = 1

Results
[Figure]

Neural Network PCA
Basic Computational Element - Neuron
Inputs/Outputs, Synaptic Weights, Activation Function

3-Layer Autoassociators
N input, N output and M<N hidden neurons.
Drawbacks of this model: the optimal solution remains the PCA projection.

5-Layer Autoassociators
- Neural network approximators for principal surfaces using 5 layers of neurons.
- Global, non-linear dimension reduction technique.
- Successful implementations of nonlinear PCA using these networks exist for image and speech dimension reduction and for obtaining concise representations of color.
- The third layer carries the dimension-reduced representation and has width M<N.
- Linear functions are used for the representation layer.
- The networks are trained to minimize an MSE training criterion.
- Approximators of principal surfaces.

Locally Linear Approach to nonlinear dimension reduction (VQPCA Algorithm)
Much faster to train than five-layer autoassociators, and provides superior solutions. Like 5-layer autoassociators, this algorithm attempts to minimize the MSE between the original data and its reconstruction from a low-dimensional representation (the reconstruction error).

VQPCA
2 steps in the algorithm:
1) Partition the data space by VQ (clustering).
2) Perform local PCA about each cluster center.
VQPCA is in effect a local PCA on each cluster.
We can use 2 kinds of distance measures in VQPCA:
1) Euclidean Distance
2) Reconstruction Distance
Example intended for a 1D local PCA: [Figure]

5-layer Autoassociators vs VQPCA
- 5-layer autoassociators are difficult to train.
- Training is faster with the VQPCA algorithm. (VQ can be accelerated using tree-structured or multistage VQ.)
- 5-layer autoassociators are prone to getting trapped in poor local optima.
- VQPCA is slower for encoding new data but much faster for decoding.
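The two VQPCA steps can be sketched as follows, again in Python with NumPy. This is an illustrative sketch under stated assumptions, not the project's implementation: it uses the Euclidean distance for the partitioning step (not the reconstruction distance), initialises Lloyd's algorithm by random assignment, and re-seeds any set that becomes empty with a random data point. The names `lloyd`, `local_pca_reconstruct` and `mse`, the choice of k = 4 and the test data (the curve f(x) = -x²) are all hypothetical choices for the example.

```python
import numpy as np

def lloyd(data, k, rng, max_iter=100):
    """Partition data into k sets with Lloyd's algorithm (Euclidean distance)."""
    n = len(data)
    labels = rng.integers(k, size=n)            # Step 1: random assignment
    for _ in range(max_iter):
        # Step 2: calculate the centroid of each set.
        centroids = np.empty((k, data.shape[1]))
        for j in range(k):
            members = data[labels == j]
            # Assumption: an empty set is re-seeded with a random data point.
            centroids[j] = members.mean(axis=0) if len(members) else data[rng.integers(n)]
        # Step 3: associate each point with the nearest centroid.
        dists = ((data[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        new_labels = dists.argmin(axis=1)
        if np.array_equal(new_labels, labels):  # converged: nothing changed
            break
        labels = new_labels
    return labels

def local_pca_reconstruct(data, labels, m):
    """Reconstruct each point from an m-dimensional PCA of its own cluster."""
    recon = np.empty_like(data)
    for j in np.unique(labels):
        idx = labels == j
        center = data[idx].mean(axis=0)
        centered = data[idx] - center
        # Eigenvectors of the scatter matrix; eigh sorts eigenvalues ascending,
        # so the last m columns span the leading local subspace.
        _, vecs = np.linalg.eigh(centered.T @ centered)
        basis = vecs[:, -m:]
        recon[idx] = center + (centered @ basis) @ basis.T
    return recon

def mse(data, recon):
    return np.mean(np.sum((data - recon) ** 2, axis=1))

rng = np.random.default_rng(0)                  # illustrative seed
x = rng.uniform(-1.0, 1.0, size=300)
data = np.column_stack([x, -x ** 2])            # ID = 1, embedded in 2D

# Global PCA is the special case of a single cluster.
global_mse = mse(data, local_pca_reconstruct(data, np.zeros(len(data), dtype=int), m=1))

labels = lloyd(data, k=4, rng=rng)
local_mse = mse(data, local_pca_reconstruct(data, labels, m=1))

print("global 1D PCA reconstruction MSE:", global_mse)
print("VQPCA (k=4, 1D) reconstruction MSE:", local_mse)
```

With one local dimension per region, the local model follows the curve piecewise and its reconstruction error falls below that of a single global principal component.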
