Monthly Archives: January 2013

Amazon CloudSearch & Apache Solr 3.6

With the advent of the “Information Age”, massive amounts of data and information are being added to our lives every second. Gone are the days of MB and GB; today everything is on the order of TB and PB. Raw … Continue reading

Posted in Technical

Hive – A Petabyte Scale Data Warehouse using Hadoop

The Web has been growing rapidly in size and scale over the last 10 years and shows no signs of slowing down. Statistics show that every year more data is produced than in all previous years combined. … Continue reading

Posted in Technical

Hive for Retail Analysis

People always look for convenience! In the early 20th century, the retail industry was still in its infancy, taking baby steps across Europe and North America. But the latter half of the 20th century saw the emergence of the hypermarket and the … Continue reading

Posted in Technical

Introduction to Big Data and Hadoop Ecosystem

We live in the data age! The Web has been growing rapidly in size and scale over the last 10 years and shows no signs of slowing down. Statistics show that with every passing year more data gets generated than … Continue reading

Posted in Technical

Solr in AWS: Shards & Distributed Search

Solr is the popular, blazing fast open source enterprise search platform from the Apache Lucene project. Its major features include powerful full-text search, hit highlighting, faceted search, database integration, and geospatial search. It is highly scalable, providing distributed search and … Continue reading

Posted in Technical

Alfresco: Configure various features

Audits

1. Open <ALFRESCO_HOME>/tomcat/shared/classes/alfresco-global.properties and add the following:

   ### Audits ###
   audit.enabled=true
   audit.alfresco-access.enabled=true
   audit.alfresco-access.sub-actions.enabled=true
   audit.filter.alfresco-access.default.enabled=true
   audit.filter.alfresco-access.transaction.user=~null;.*
   audit.filter.alfresco-access.transaction.type=cm:folder;cm:content;st:site
   audit.filter.alfresco-access.transaction.path=~/sys:archivedItem;~/ver:;.*

2. Download audit-dashlet-0.43.jar:

   wget http://share-extras.googlecode.com/files/audit-dashlet-0.43.jar

3. Copy audit-dashlet-0.43.jar into the <ALFRESCO_HOME>/tomcat/shared/lib folder.
4. Log in as admin and add the audit dashlet to the dashboard. … Continue reading
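
Before restarting Alfresco, it can help to confirm that the audit switches from step 1 actually made it into alfresco-global.properties. The following is only a rough sketch: the function names and the sample text are made up for illustration, and the parser handles plain key=value lines only (no line continuations or other properties-file features).

```python
# Audit switches taken from step 1 above; the filter lines are left out for brevity.
REQUIRED_AUDIT_SETTINGS = {
    "audit.enabled": "true",
    "audit.alfresco-access.enabled": "true",
    "audit.alfresco-access.sub-actions.enabled": "true",
    "audit.filter.alfresco-access.default.enabled": "true",
}


def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and # comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def missing_audit_settings(text):
    """Return the required keys that are absent or set to a different value."""
    settings = parse_properties(text)
    return sorted(
        key
        for key, expected in REQUIRED_AUDIT_SETTINGS.items()
        if settings.get(key) != expected
    )


# Hypothetical file contents with only one of the switches set.
SAMPLE_PROPERTIES = """### Audits ###
audit.enabled=true
"""

for key in missing_audit_settings(SAMPLE_PROPERTIES):
    print("missing or wrong:", key)
```

In practice the text would come from reading the real properties file; an empty result means all four switches are set as in step 1.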

Posted in Technical

Alfresco – Replication HOWTO

In EVERY Alfresco instance, follow these steps:

1. In order to find out the repositoryId, use the following command:

   curl -uadmin "http://<ip_address>:<port>/alfresco/s/cmis" | grep cmis:repositoryId

2. Open <ALFRESCO_HOME>/tomcat/shared/classes/alfresco-global.properties and add the following:

   ### Replication ###
   replication.transfer.readonly=true

3. Open <ALFRESCO_HOME>/tomcat/shared/classes/alfresco/web-extension/share-config-custom.xml … Continue reading
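
The grep in step 1 prints the whole XML line that contains the repositoryId. If only the ID itself is needed, something like the sketch below could pull it out of the saved response. It assumes the response is well-formed XML with a repositoryId element somewhere inside; the function name and the trimmed sample document are invented for illustration and are not real Alfresco output.

```python
import xml.etree.ElementTree as ET
from typing import Optional


def extract_repository_id(service_document: str) -> Optional[str]:
    """Return the text of the first repositoryId element, or None if absent."""
    root = ET.fromstring(service_document)
    for element in root.iter():
        # Namespaced tags look like "{namespace-uri}repositoryId";
        # compare the local name only so the prefix does not matter.
        if element.tag.rsplit("}", 1)[-1] == "repositoryId":
            return (element.text or "").strip()
    return None


# Hypothetical, heavily trimmed response used only for illustration.
SAMPLE = """<service xmlns:cmis="http://docs.oasis-open.org/ns/cmis/core/200908/">
  <workspace>
    <cmis:repositoryId>example-repository-id</cmis:repositoryId>
  </workspace>
</service>"""

print(extract_repository_id(SAMPLE))
```

Matching on the local tag name keeps the sketch independent of the exact namespace prefix used in the response.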

Posted in Uncategorized