RPM Commands

Reference: http://www.cnblogs.com/xiaochaohuashengmi/archive/2011/10/08/2203153.html

RPM is the RedHat Package Manager, roughly the equivalent of "Add/Remove Programs" on Windows.

rpm handles two kinds of packages: binary (Binary) and source (Source). A binary package can be installed directly, while a source package is compiled and then installed by RPM automatically. Source packages usually carry the suffix src.rpm.
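For example (the package name foo-1.0-2 below is just a placeholder), a binary package is installed directly, while a source package is first rebuilt into a binary rpm with rpmbuild:

rpm -ivh foo-1.0-2.i386.rpm           # install a binary package directly
rpmbuild --rebuild foo-1.0-2.src.rpm  # rebuild a source package; the resulting binary rpm lands in the RPMS build directory and can then be installed with rpm -ivh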

Common option combinations:

-ivh: install and show progress (--install --verbose --hash)
-Uvh: upgrade a package (--upgrade)
-qpl: list the files contained in an RPM package file (query package list)
-qpi: show the description of an RPM package file (query package info)
-qf: find out which RPM package a given file belongs to (query file)
-Va: verify all installed RPM packages and report missing or modified files (--verify --all)
-e: erase (uninstall) a package

rpm -q samba //check whether the samba package is installed

rpm -ivh /media/cdrom/RedHat/RPMS/samba-3.0.10-1.4E.i386.rpm //install from a path and show progress
rpm -ivh --relocate /=/opt/gaim gaim-1.3.0-1.fc4.i386.rpm //relocate the installation to a different directory

rpm -ivh --test gaim-1.3.0-1.fc4.i386.rpm    //only check dependencies; nothing is actually installed
rpm -Uvh --oldpackage gaim-1.3.0-1.fc4.i386.rpm //downgrade to an older version

rpm -qa | grep httpd      #[check whether an rpm is installed] --all, then filter for *httpd*
rpm -ql httpd         #[list package contents] --list all installed files and directories

rpm -qpi Linux-1.4-6.i386.rpm #[inspect an rpm file] --query --package --info: package information
rpm -qpf Linux-1.4-6.i386.rpm #[inspect an rpm file] --file
rpm -qpR file.rpm       #[inspect an rpm file] dependencies (--requires)
rpm2cpio file.rpm | cpio -div #[extract the files without installing]

rpm -ivh file.rpm  #[install a new rpm] --install --verbose --hash

rpm -Uvh file.rpm #[upgrade an rpm] --upgrade
rpm -e packagename #[erase an installed rpm] --erase (takes the package name, not the .rpm file)

Common options:

Install/Upgrade/Erase options:

-i, --install install package(s)
-v, --verbose provide more detailed output
-h, --hash print hash marks as package installs (good with -v)
-e, --erase erase (uninstall) package
-U, --upgrade=<packagefile>+ upgrade package(s)
--replacepkgs reinstall the package even if it is already installed
--test run through the installation only as a test, without actually installing
--nodeps ignore package dependencies and install anyway
--force ignore package and file conflicts

Query options (with -q or --query):
-a, --all query/verify all packages
-p, --package query/verify a package file
-l, --list list files in package
-d, --docfiles list all documentation files
-f, --file query/verify package(s) owning file
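
A few quick illustrations of the query options (httpd is just an example package, and the config file path assumes a standard Red Hat layout):

rpm -q httpd                          # show name-version-release of the installed package
rpm -qd httpd                         # --docfiles: list only its documentation files
rpm -qf /etc/httpd/conf/httpd.conf    # find out which installed package owns a given file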

How to use RPM Commands

Reference: http://www.tldp.org/LDP/solrhe/Securing-Optimizing-Linux-RH-Edition-v1.3/chap3sec20.html

This section contains an overview of the principal modes of using RPM for installing, uninstalling, upgrading, querying, listing, and checking RPM packages on your Linux system. You should become familiar with these RPM commands now, because we'll use them often in the rest of this book. To install an RPM package, use the command:

                 [root@deep] /#rpm -ivh foo-1.0-2.i386.rpm

Note that RPM package files have names like foo-1.0-2.i386.rpm, which encode the package name (foo), version (1.0), release (2), and architecture (i386).

To uninstall a RPM package, use the command:

                 [root@deep] /#rpm -e foo

Notice that we used the package name foo, not the name of the original package file foo-1.0-2.i386.rpm.

To upgrade a RPM package, use the command:

                 [root@deep] /#rpm -Uvh foo-1.0-2.i386.rpm

With this command, RPM automatically uninstalls the old version of the foo package and installs the new one. It is safe to always use rpm -Uvh to install packages, since it works fine even when no previous version of the package is installed.

To query a RPM package, use the command:

                 [root@deep] /#rpm -q foo

This command will print the package name, version, and release number of the installed package foo. Use it to verify whether a package is installed on your system.

To display package information, use the command:

                 [root@deep] /#rpm -qi foo

This command displays package information, including the name, version, and description of the installed program.

To list files in package, use the command:

                 [root@deep] /#rpm -ql foo

This command will list all the files in an installed RPM package. It works only when the package is already installed on your system.

To check the signature of an RPM package, use the command:

                 [root@deep] /#rpm --checksig foo-1.0-2.i386.rpm

This command checks the PGP signature of the specified package file to ensure its integrity and origin. Always run it before installing a new RPM package on your system. Note that GnuPG or PGP software must already be installed before you can use this command.
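
Before signature checks can succeed, the vendor's public key usually has to be imported once. A minimal sketch (the key path below is only an example; the actual location depends on your distribution):

rpm --import /media/cdrom/RPM-GPG-KEY        # import the vendor's public GPG key
rpm --checksig foo-1.0-2.i386.rpm            # then verify the package signature against it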

Maven

Reference: https://en.wikipedia.org/wiki/Apache_Maven

Maven is a build automation tool used primarily for Java projects. The word maven means “accumulator of knowledge” in Yiddish.[3] Maven addresses two aspects of building software: first, it describes how software is built, and second, it describes its dependencies. Contrary to preceding tools like Apache Ant, it uses conventions for the build procedure, and only exceptions need to be written down. An XML file describes the software project being built, its dependencies on other external modules and components, the build order, directories, and required plug-ins. It comes with pre-defined targets for performing certain well-defined tasks such as compilation of code and its packaging. Maven dynamically downloads Java libraries and Maven plug-ins from one or more repositories such as the Maven 2 Central Repository, and stores them in a local cache.[4] This local cache of downloaded artifacts can also be updated with artifacts created by local projects. Public repositories can also be updated.
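
A typical command-line session looks roughly like this (the groupId/artifactId values are arbitrary placeholders); Maven resolves any missing dependencies and plug-ins from the configured repositories and caches them locally, by default under ~/.m2/repository:

mvn archetype:generate -DgroupId=com.example -DartifactId=my-app -DarchetypeArtifactId=maven-archetype-quickstart -DinteractiveMode=false   # generate a new project skeleton from a standard archetype
mvn package           # compile, test, and package the project using Maven's default conventions
mvn dependency:tree   # print the dependency graph resolved from pom.xml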

Maven can also be used to build and manage projects written in C#, Ruby, Scala, and other languages. The Maven project is hosted by the Apache Software Foundation, where it was formerly part of the Jakarta Project.

Maven is built using a plugin-based architecture that allows it to make use of any application controllable through standard input. Theoretically, this would allow anyone to write plugins to interface with build tools (compilers, unit test tools, etc.) for any other language. In reality, support and use for languages other than Java has been minimal. Currently a plugin for the .NET framework exists and is maintained,[5] and a C/C++ native plugin is maintained for Maven 2.[6]

Alternative build tools like Gradle and sbt do not rely on XML, but keep the key concepts Maven introduced. With Apache Ivy, a dedicated dependency manager was developed as well that also supports Maven repositories.[7]

Hadoop

Refer to: http://hadoop.apache.org/

What Is Apache Hadoop?

The Apache™ Hadoop® project develops open-source software for reliable, scalable, distributed computing.

The Apache Hadoop software library is a framework that allows for the distributed processing of large data sets across clusters of computers using simple programming models. It is designed to scale up from single servers to thousands of machines, each offering local computation and storage. Rather than rely on hardware to deliver high-availability, the library itself is designed to detect and handle failures at the application layer, so delivering a highly-available service on top of a cluster of computers, each of which may be prone to failures.

The project includes these modules:

  • Hadoop Common: The common utilities that support the other Hadoop modules.
  • Hadoop Distributed File System (HDFS™): A distributed file system that provides high-throughput access to application data.
  • Hadoop YARN: A framework for job scheduling and cluster resource management.
  • Hadoop MapReduce: A YARN-based system for parallel processing of large data sets.
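
As a small, hands-on sketch of HDFS and MapReduce together (file paths and the examples jar name are placeholders that depend on your installation):

hdfs dfs -mkdir -p /user/demo/input                  # create a directory in HDFS
hdfs dfs -put localfile.txt /user/demo/input         # copy a local file into HDFS
hadoop jar hadoop-mapreduce-examples-*.jar wordcount /user/demo/input /user/demo/output   # run the bundled WordCount job on YARN
hdfs dfs -cat /user/demo/output/part-r-00000         # read the reducer output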

Other Hadoop-related projects at Apache include:

  • Ambari™: A web-based tool for provisioning, managing, and monitoring Apache Hadoop clusters, which includes support for Hadoop HDFS, Hadoop MapReduce, Hive, HCatalog, HBase, ZooKeeper, Oozie, Pig and Sqoop. Ambari also provides a dashboard for viewing cluster health (such as heatmaps) and the ability to view MapReduce, Pig and Hive applications visually, along with features to diagnose their performance characteristics in a user-friendly manner.
  • Avro™: A data serialization system.
  • Cassandra™: A scalable multi-master database with no single points of failure.
  • Chukwa™: A data collection system for managing large distributed systems.
  • HBase™: A scalable, distributed database that supports structured data storage for large tables.
  • Hive™: A data warehouse infrastructure that provides data summarization and ad hoc querying.
  • Mahout™: A scalable machine learning and data mining library.
  • Pig™: A high-level data-flow language and execution framework for parallel computation.
  • Spark™: A fast and general compute engine for Hadoop data. Spark provides a simple and expressive programming model that supports a wide range of applications, including ETL, machine learning, stream processing, and graph computation.
  • Tez™: A generalized data-flow programming framework, built on Hadoop YARN, which provides a powerful and flexible engine to execute an arbitrary DAG of tasks to process data for both batch and interactive use-cases. Tez is being adopted by Hive™, Pig™ and other frameworks in the Hadoop ecosystem, and also by other commercial software (e.g. ETL tools), to replace Hadoop™ MapReduce as the underlying execution engine.
  • ZooKeeper™: A high-performance coordination service for distributed applications.

Getting Started

To get started, begin here:

  1. Learn about Hadoop by reading the documentation.
  2. Download Hadoop from the release page.
  3. Discuss Hadoop on the mailing list.

Memcached

Refer to: http://www.memcached.org/about

 

About Memcached

memcached is a high-performance, distributed memory object caching system, generic in nature, but originally intended for use in speeding up dynamic web applications by alleviating database load.

You can think of it as a short-term memory for your applications.

What it Does

memcached allows you to take memory from parts of your system where you have more than you need and make it accessible to areas where you have less than you need.

memcached also allows you to make better use of your memory. Consider two deployment scenarios:

  1. Each node is completely independent (top).
  2. Each node can make use of memory from other nodes (bottom).

The first scenario illustrates the classic deployment strategy. However, it is wasteful in two ways: the total cache size is only a fraction of the actual capacity of your web farm, and considerable effort is required to keep the cache consistent across all of those nodes.

With memcached, all of the servers look into the same virtual pool of memory. This means that a given item is always stored in, and retrieved from, the same location across your entire web cluster.

Also, as the demand for your application grows to the point where you need to have more servers, it generally also grows in terms of the data that must be regularly accessed. A deployment strategy where these two aspects of your system scale together just makes sense.

This example considers only two web servers for simplicity, but the property holds as the number increases. If you had fifty web servers, you'd still have a usable cache size of 64MB in the first scenario, but in the second you'd have 3.2GB of usable cache.

Of course, you aren’t required to use your web server’s memory for cache. Many memcached users have dedicated machines that are built to only be memcached servers.