Example code: https://github.com/lilihongjava/prophet_demo/tree/master/outliers

The workflow is the same for each sample dataset (example_wp_log_R_outliers1.csv and example_wp_log_R_outliers2.csv): load the CSV, fit a Prophet model, and build a future dataframe for forecasting:

    # encoding: utf-8
    df = pd.read_csv('.../data/example_wp_log_R_outliers1.csv')
    m = Prophet()
    m.fit(df)
    future = m.make_future_dataframe(periods=...)

Reference: https://facebook.github.io/prophet/docs/outliers.html
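Prophet's documented way of handling outliers is not a special API: you simply set the offending observations to NA/None before fitting, and the model skips them during training while still producing forecasts for those dates. A minimal standard-library sketch of that masking step (the date window and values below are illustrative, not taken from the example CSVs):

```python
# Sketch: blank out y inside a known-bad date window so a later
# Prophet fit ignores those points. Dates/values are made up.
from datetime import date

def mask_outlier_window(rows, start, end):
    """rows: list of (ds, y) pairs; return pairs with y=None in [start, end]."""
    return [(ds, None if start <= ds <= end else y) for ds, y in rows]

rows = [(date(2010, 6, 1), 7.2), (date(2010, 6, 5), 12.9), (date(2010, 6, 9), 7.3)]
masked = mask_outlier_window(rows, date(2010, 6, 4), date(2010, 6, 6))
print(masked)  # the 2010-06-05 spike is now None
```

In a real pandas workflow the same idea is `df.loc[mask, 'y'] = None` for the rows you consider outliers.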
Lectures 4 and 5: Data cleaning — missing values and outlier detection

- Be able to explain the need for data cleaning, e.g. inconsistent date formats ("3rd April 2016"), contradictory attributes (Age = 20 but Birthdate = "1/1/2002"), or duplicate records (two students with the same student id).
- Know the fill-in strategies for missing values, e.g. the mean (or the median, if the distribution is skewed) or the category mean.
- Be able to explain the importance of finding outliers.
- Know that noise is random error or variance in a measured variable, and that noise should be removed before outlier detection.
- Be able to explain how a histogram and other methods can be used to detect outliers, and their relative advantages/disadvantages.
When a dataset contains a small number of outliers, they usually need to be removed so they do not distort the results. A box-plot (IQR) rule can be used to detect and remove them. First, define a function that replaces outliers with NA:

    remove_outliers <- function(x, na.rm = TRUE, ...) {
      qnt <- quantile(x, probs = c(.25, .75), na.rm = na.rm, ...)
      H <- 1.5 * IQR(x, na.rm = na.rm)
      y <- x
      y[x < (qnt[1] - H)] <- NA
      y[x > (qnt[2] + H)] <- NA
      y
    }

Then mark the outliers as NA within each group (rows containing NA can be dropped afterwards):

    library(dplyr)
    df2 <- df %>%
      group_by(element) %>%
      mutate(value = remove_outliers(value))
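For readers following along in Python, the same IQR fence can be sketched with the standard library alone (a hypothetical helper mirroring the R function above, not code from the source):

```python
# Sketch of the IQR rule: values outside [Q1 - 1.5*IQR, Q3 + 1.5*IQR]
# are replaced with None (Python's analogue of R's NA).
from statistics import quantiles

def remove_outliers(values):
    """Replace IQR-rule outliers with None."""
    q1, _, q3 = quantiles(values, n=4)   # quartiles of the data
    h = 1.5 * (q3 - q1)                  # 1.5 * IQR fence
    return [None if (v < q1 - h or v > q3 + h) else v for v in values]

data = [10, 12, 11, 13, 12, 11, 99]      # 99 is an obvious outlier
print(remove_outliers(data))
```

With pandas, the same fence is usually applied per group via `groupby(...).transform(...)`, matching the dplyr pattern above.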
This article is excerpted from "Comparison of Outlier Detection Methods in R" (《R语言Outliers异常值检测方法比较》).
Each feature fea is tagged with a 3-sigma rule: values beyond mean ± 3·std (outliers_cut_off) are labeled '异常值' (outlier), the rest '正常值' (normal), and the default counts are printed per group:

    lower_rule = data_mean - outliers_cut_off
    upper_rule = data_mean + outliers_cut_off
    data[fea + '_outliers'] = data[fea].apply(
        lambda x: '异常值' if x > upper_rule or x < lower_rule else '正常值')
    print(data[fea + '_outliers'].value_counts())
    print(data.groupby(fea + '_outliers')['isDefault'].sum())
    print('*' * 10)

Sample output (异常值 = outlier, 正常值 = normal):

    正常值    800000
    Name: id_outliers, dtype: int64
    Name: term_outliers, dtype: int64
    term_outliers
    正常值    159610
    Name: isDefault, dtype: int64
    **********
    正常值    800000
    Name: employmentTitle_outliers, dtype: int64
    employmentTitle_outliers
    正常值    159610
    ...
    正常值    792471
    异常值      7529
    Name: pubRec_outliers, dtype: int64
    pubRec_outliers
    异常值    1701
    正常值    ...
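The 3-sigma tagging above can be sketched in plain Python without pandas (the helper name and data are illustrative, not from the source):

```python
# Sketch of the 3-sigma rule: points beyond mean +/- 3*std are tagged
# "outlier", everything else "normal".
from statistics import mean, pstdev

def tag_outliers_3sigma(values):
    m = mean(values)
    cut = 3 * pstdev(values)             # three standard deviations
    return ["outlier" if abs(v - m) > cut else "normal" for v in values]

vals = [1] * 20 + [2] * 20 + [100]       # one gross outlier among 41 points
tags = tag_outliers_3sigma(vals)
print(tags.count("outlier"), tags.count("normal"))
```

Note that a single extreme value inflates the standard deviation, so with very small samples the 3-sigma rule can fail to flag even an obvious outlier; it works best with many observations, as in the 800,000-row dataset above.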
ee-outliers is a tool for detecting outliers in events stored in Elasticsearch. This article shows how to use ee-outliers to detect outliers in security events stored in Elasticsearch.

Preparing ee-outliers: ee-outliers runs entirely in Docker, so the environment requirements are close to zero.

Creating the configuration file: the default configuration file of ee-outliers on GitHub contains all the configuration options you need, e.g.:

    run_model=1
    test_model=0

Running ee-outliers: once the model is configured, run ee-outliers to see the results:

    .../config" -i outliers-dev:latest python3 outliers.py interactive --config /mappedvolumes/config/outliers.conf
    import matplotlib
    from sklearn import svm
    from sklearn.covariance import EllipticEnvelope
    from sklearn.ensemble import IsolationForest
    from sklearn.neighbors import LocalOutlierFactor

    matplotlib.rcParams['contour.negative_linestyle'] = 'solid'

    # parameters
    n_samples = 300
    outliers_fraction = 0.15
    n_outliers = int(outliers_fraction * n_samples)
    n_inliers = n_samples - n_outliers

    # outlier/anomaly detection methods to compare
    anomaly_algorithms = [
        ("Robust covariance", EllipticEnvelope(contamination=outliers_fraction)),
        ("One-Class SVM", svm.OneClassSVM(nu=outliers_fraction, kernel='rbf', gamma=0.1)),
        ("Isolation Forest", IsolationForest(contamination=outliers_fraction)),
        ("Local Outlier Factor", LocalOutlierFactor(n_neighbors=35, contamination=outliers_fraction)),
    ]

    # define the datasets
    blobs_params = dict(random_state=0, n_samples=n_inliers, n_features=2)

(In the original snippet the n_neighbors=35 argument had been garbled onto IsolationForest; it belongs to LocalOutlierFactor, and IsolationForest takes contamination instead.)
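An aside on why "Robust covariance" earns its name: estimators built on the mean and standard deviation are themselves distorted by the very outliers they are meant to find, while the median and the median absolute deviation (MAD) barely move. A small stdlib illustration with made-up numbers (not part of the sklearn example):

```python
# Compare non-robust (mean) and robust (median, MAD) estimates on the
# same data before and after injecting one gross outlier.
from statistics import mean, median

def mad(values):
    """Median absolute deviation, a robust spread estimate."""
    med = median(values)
    return median(abs(v - med) for v in values)

clean = [10.0, 11.0, 10.5, 9.5, 10.2, 10.8]
dirty = clean + [100.0]                  # one gross outlier

print(mean(clean), mean(dirty))          # mean shifts a lot
print(median(clean), median(dirty))      # median barely moves
print(mad(clean), mad(dirty))            # MAD stays stable
```

This is the same motivation behind EllipticEnvelope's robust covariance estimate: a fit that resists contamination gives a more trustworthy boundary between inliers and outliers.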
", point_size = 0.2) + ggtitle("Local Outliers (Mito Prop)") # plot using patchwork (p1 / p2) | ( ", annotate = "sum_outliers", point_size = 0.5) + xlab("sum_outliers") # z-transformed detected genes and outliers p2 <- plotObsQC(spe, plot_type = "violin", x_metric = "detected_z", annotate = "detected_<em>outliers</em>", point_size = 0.5) + xlab("detected_outliers") # z-transformed ", annotate = "subsets_mito_percent_outliers", point_size = 0.5) + xlab("mito_outliers
    In [15]: outliers
    Out[15]: array([0, 0, 0, ..., 1, 0, 0])

    In [16]: data["outliers"] = outliers  # add the predictions to the scaled data
             df["outliers"] = outliers    # add the predictions to the original data

    In [17]: # handle the rows with and without outliers separately
             # scaled data without outliers
             data_no_outliers = data[data["outliers"] == 0]
             data_no_outliers = data_no_outliers.drop(["outliers"], axis=1)
             # scaled data keeping the outlier rows (flag column dropped)
             data_with_outliers = data.copy()
             data_with_outliers = data_with_outliers.drop(["outliers"], axis=1)
             # original data without outliers
             df_no_outliers = df[df["outliers"] == 0]
             df_no_outliers = df_no_outliers.drop(["outliers"], axis=1)

    In [18]: data_no_outliers.head()
The check_collinearity() result can be visualized:

    plot(result)

(Example of check_collinearity())

Example 3: check for outliers (check_outliers()):

    mt1 <- mtcars[, c(1, 3, 4)]
    # create some fake outliers and attach them to the main data frame
    mt2 <- rbind(mt1, data.frame(mpg = c(37, 40), disp = c(300, 400), hp = c(110, 120)))
    # fit the model with the outliers included
    model <- lm(disp ~ mpg + hp, data = mt2)
    result <- check_outliers(model)
    # Warning: 2 outliers detected (cases ...)

Method 2: bars indicating influential observations:

    plot(result, type = "bars")

(Example 02 of check_outliers())
    for i, (clf_name, clf) in enumerate(classifiers.items()):
        print()
        print(i + 1, 'fitting', clf_name)
        # fit the data and tag outliers
        ...
        a = subplot.contourf(xx, yy, Z, levels=[threshold, Z.max()], colors='orange')
        b = subplot.scatter(X[:-n_outliers, 0], X[:-n_outliers, 1],
                            c='white', s=20, edgecolor='k')
        c = subplot.scatter(X[-n_outliers:, 0], X[-n_outliers:, 1],
                            c='black', s=20, edgecolor='k')
        subplot.legend([a.collections[0], b, c],
                       ['learned decision function', 'true inliers', 'true outliers'])
Check the data size:

    In [19]: data_no_outliers.shape
    from pyod.models.ecod import ECOD

    clf = ECOD()
    clf.fit(data)
    outliers = clf.predict(data)
    data["outliers"] = outliers

    # Data without outliers
    data_no_outliers = data[data["outliers"] == 0]
    data_no_outliers = data_no_outliers.drop(["outliers"], axis=1)

    # Data with outliers (flag column dropped)
    data_with_outliers = data.copy()
    data_with_outliers = data_with_outliers.drop(["outliers"], axis=1)

    print(data_no_outliers.shape)

Finally, the characteristics of each cluster must be analyzed; this part is decisive for business decisions. To that end, we take, for each cluster, the mean of each numeric variable and the most frequent value of each categorical variable:

    df_no_outliers = df[df.outliers == 0]
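ECOD scores each point by how deep it sits in the tails of each feature's empirical CDF, aggregating -log tail probabilities across dimensions. A deliberately simplified one-dimensional sketch of that idea (an illustration of the principle, not pyod's implementation; names are made up):

```python
# Simplified ECDF-tail outlier score: a point far in either tail of the
# empirical CDF gets a large -log tail probability.
import math

def ecdf_tail_scores(values):
    n = len(values)
    s = sorted(values)
    scores = []
    for v in values:
        left = sum(1 for x in s if x <= v) / n    # empirical P(X <= v)
        right = sum(1 for x in s if x >= v) / n   # empirical P(X >= v)
        scores.append(-math.log(min(left, right)))
    return scores

vals = [5, 6, 5, 7, 6, 5, 40]
scores = ecdf_tail_scores(vals)
print(max(range(len(vals)), key=scores.__getitem__))  # prints 6 (the index of 40)
```

The real ECOD does this per feature and sums the scores, which is why it is parameter-free and fast even in high dimensions.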
The novel outliers X_outliers form a 20x2 array, e.g.:

    array([[-2.60871078, -1.94353134],
           ...,
           [-1.76587184, -2.50357511]])

    X = 0.3 * np.random.randn(20, 2)
    X_test = np.r_[X + 2, X - 2]
    # Generate some abnormal novel observations
    X_outliers = ...
    y_pred_outliers = clf.predict(X_outliers)
    n_error_train = y_pred_train[y_pred_train == -1].size
    n_error_test = y_pred_test[y_pred_test == -1].size
    n_error_outliers = y_pred_outliers[y_pred_outliers == 1].size

    # plot the line and the points
    plt.scatter(X_outliers[:, 0], X_outliers[:, 1], c='gold', s=s)
    plt.axis('tight')
    plt.xlim((-5, 5))
    plt.ylim((-5, 5))
    plt.legend(...)
    # 200 points formed by stacking X + 2 and X - 2
    X = 0.3 * rng.randn(20, 2)
    X_test = np.r_[X + 2, X - 2]
    # generate some regular new observations based on the distribution
    X_outliers = ...

    clf = IsolationForest(contamination='auto')
    clf.fit(X_train)
    y_pred_train = clf.predict(X_train)
    y_pred_test = clf.predict(X_test)
    y_pred_outliers = clf.predict(X_outliers)

    # plot
    xx, yy = np.meshgrid(np.linspace(-5, 5, 50), np.linspace(-5, 5, 50))
    plt.scatter(X_test[:, 0], X_test[:, 1], c='green', s=20, edgecolor='k')
    c = plt.scatter(X_outliers[:, 0], X_outliers[:, 1], c='red', s=20, edgecolor='k')
    plt.axis('tight')
    plt.xlim((-5, 5))
Properties of the correlation coefficient:
‣ the correlation of X with Y is the same as that of Y with X
‣ (6) the correlation coefficient is sensitive to outliers

R²: the remainder of the variability is explained by variables not included in the model
‣ always between 0 and 1

Outliers in regression:
‣ outliers are points that fall away from the cloud of points
‣ outliers that fall horizontally away from the center of the cloud but don't influence the slope of the regression line are called leverage points
‣ outliers that do influence the slope of the regression line are called influential points
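The leverage/influence distinction above can be made concrete with a tiny least-squares computation (hypothetical data, stdlib only): a point far from the cloud in x that also falls off the trend line drags the fitted slope, and removing it restores the slope.

```python
# Ordinary least-squares slope, computed from scratch, with and without
# a single high-leverage, off-trend point.
def ols_slope(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den

xs = [1, 2, 3, 4, 5, 20]
ys = [1, 2, 3, 4, 5, 0]              # last point: high leverage, off the trend
print(ols_slope(xs, ys))              # slope pulled negative by the outlier
print(ols_slope(xs[:-1], ys[:-1]))    # prints 1.0 without the outlier
```

If the (20, 0) point instead lay on the line (e.g. at (20, 20)), it would have high leverage but little influence: the slope would stay near 1. That is exactly the leverage-vs-influential distinction in the notes.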
    clf.fit(X_train)
    y_pred_train = clf.predict(X_train)
    y_pred_test = clf.predict(X_test)
    y_pred_outliers = clf.predict(X_outliers)
    n_error_train = y_pred_train[y_pred_train == -1].size
    n_error_test = y_pred_test[y_pred_test == -1].size
    n_error_outlier = y_pred_outliers[y_pred_outliers == 1].size

    # plot the line and the points
    b2 = plt.scatter(X_test[:, 0], X_test[:, 1], c='blueviolet', s=s, edgecolors='k')
    c = plt.scatter(X_outliers[:, 0], X_outliers[:, 1], c='gold', s=s, edgecolors='k')
    plt.axis('tight')
    plt.xlim((-5, 5))
    plt.ylim((-5, 5))
    plt.scatter(normal_data[:, 0], normal_data[:, 1])
    plt.scatter(outliers[:, 0], outliers[:, 1])
    plt.title("Random data points with outliers identified.")
    plt.show()

As the plot shows, it works well and picks out the data points around the edges.

    top_5_outliers = data_scores.sort_values(by=['Anomaly Score']).head()
    plt.scatter(data[:, 0], data[:, 1])
    plt.scatter(top_5_outliers['X'], top_5_outliers['Y'])
    plt.title("Random data points with only 5 outliers identified.")
    plt.show()

Summary: Isolation Forest is a fundamentally different kind of outlier-detection model, and it can find anomalies extremely quickly.
It's important to note that there are many "camps" when it comes to outliers and outlier detection. Some treat outliers as genuine extreme observations worth studying; on the other hand, outliers can be due to a measurement error or some other outside factor. This is the most credence we'll give to the debate; the rest of this recipe is about finding outliers. First we generate a cluster of 100 points, then find the 5 points farthest from the centroid; these are the potential outliers:

    from sklearn.datasets import ...

For those playing along at home, try to guess which points will be identified as one of the five outliers.
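The recipe's plan can be sketched with the standard library alone (the original uses scikit-learn's data generators; the data here is synthetic and the names are illustrative):

```python
# Generate a Gaussian cluster, then flag the 5 points farthest from the
# centroid as potential outliers.
import math
import random

random.seed(0)
points = [(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(100)]

# centroid of the cloud
cx = sum(p[0] for p in points) / len(points)
cy = sum(p[1] for p in points) / len(points)

def dist_to_centroid(p):
    return math.hypot(p[0] - cx, p[1] - cy)

outliers = sorted(points, key=dist_to_centroid, reverse=True)[:5]
print(len(outliers))
```

Distance-to-centroid is the simplest possible outlier score; it assumes a single roughly spherical cluster, which is exactly the setting this recipe constructs.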