Search Training

Overview:

Cloudera University’s Search training course is for developers and data engineers who want to index data in Hadoop for more powerful real-time queries and integrate Cloudera Search with external applications. This course is part of the developer learning path.

Delivery Method and Course Duration:

OnDemand: 180 days

Classroom: 3 days

Objectives:

Through instructor-led discussion and interactive, hands-on exercises, participants will navigate the Hadoop ecosystem, learning topics such as:

Performing batch indexing of data stored in HDFS and HBase
Indexing streaming data in near-real-time with Flume
Indexing content in multiple languages and file formats
Processing and transforming incoming data with Morphlines
Creating a user interface for an index using Hue
Integrating Cloudera Search with external applications
Improving the search experience with faceting, highlighting, and spelling correction

Intended Audience & Prerequisites:

This course is intended for developers and data engineers with at least basic familiarity with Hadoop and experience programming in a general-purpose language such as Java, C, C++, Perl, or Python. Participants should be comfortable with the Linux command line and should be able to perform basic tasks such as creating and removing directories, viewing and changing file permissions, executing scripts, and examining file output. No prior experience with Apache Solr, Cloudera Search, HBase, or SQL is required.

Advance Your Ecosystem Expertise

Cloudera Search brings full-text, interactive search and scalable, flexible indexing to an enterprise data hub. Powered by Apache Solr, Search delivers scale and reliability for a new generation of integrated, multi-workload queries.

Course outline:

1. Introduction

2. Overview of Cloudera Search

What is Cloudera Search?
Helpful Features
Use Cases
Basic Architecture

3. Performing Basic Queries

Executing a Query in the Admin UI
Basic Syntax
Techniques for Approximate Matching
Controlling Output

4. Writing More Powerful Queries

Relevancy and Filters
Query Parsers
Functions
Geospatial Search
Faceting

5. Preparing to Index Documents

Overview of the Indexing Process
Understanding Morphlines
Generating Configuration Files
Schema Design
Collection Management

6. Batch Indexing HDFS Data with MapReduce

Overview of the HDFS Batch Indexing Process
Using the MapReduce Indexing Tool
Testing and Troubleshooting

7. Near-Real-Time Indexing with Flume

Overview of the Near-Real-Time Indexing Process
Introduction to Apache Flume
How to Perform Near-Real-Time Indexing with Flume
Testing and Troubleshooting

8. Indexing HBase Data with Lily

What is Apache HBase?
Batch Indexing for HBase
Indexing HBase Tables in Near-Real-Time

9. Indexing Data in Other Languages and Formats

Field Types and Analyzer Chains
Word Stemming, Character Mapping, and Language Support
Schema and Analysis Support in the Admin UI
Metadata and Content Extraction with Apache Tika
Indexing Binary File Types with SolrCell

10. Improving Search Quality and Performance

Delivering Relevant Results
Helping Users Find Information
Query Performance and Troubleshooting

11. Building User Interfaces for Search

Search UI Overview
Building a User Interface with Hue
Integrating Search into Custom Applications

12. Considerations for Deployment

Planning for Deployment
Determining Hardware Needs
Security Overview
Collection Aliasing

13. Conclusion

